Pure-MNIST — a network with nothing to hide
A neural network with nothing to hide
- ROLE: Solo build
- TIMEFRAME: 2024
- STACK: Python, NumPy
- LINKS: github ↗
0 deps beyond NumPy
The problem
Frameworks make backpropagation easy to use and easy to never understand. The point of this build was to be unable to hide behind loss.backward().
Approach
A 784–128–10 network implemented as a modular layer API in NumPy only. DenseLayer owns the weights and the Y = XW + b forward pass. ReLU masks the gradient on the backward pass, so units whose input was non-positive pass nothing back. Softmax is fused with cross-entropy for numerical stability, which also reduces the gradient with respect to the logits to the softmax output minus the one-hot label. Weights start from He initialization. Every gradient is derived by hand from the chain rule and applied through vectorized matrix updates, with no autograd anywhere.
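The layer API described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: only the names DenseLayer, ReLU and Softmax come from the description, while the method names (`forward`, `backward`, `step`), the helper functions `softmax_cross_entropy` and `forward_backward`, the mean-over-batch loss, and plain SGD as the update rule are all hypothetical choices made for the example.

```python
import numpy as np


class DenseLayer:
    """Fully connected layer computing Y = XW + b."""

    def __init__(self, n_in, n_out, rng):
        # He initialization: zero-mean Gaussian with std = sqrt(2 / fan_in).
        self.W = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
        self.b = np.zeros(n_out)

    def forward(self, X):
        self.X = X  # cached for the backward pass
        return X @ self.W + self.b

    def backward(self, dY):
        # Chain rule for Y = XW + b.
        self.dW = self.X.T @ dY
        self.db = dY.sum(axis=0)
        return dY @ self.W.T  # gradient with respect to X

    def step(self, lr):
        # Plain SGD update (an assumed optimizer, for illustration).
        self.W -= lr * self.dW
        self.b -= lr * self.db


class ReLU:
    def forward(self, Z):
        self.mask = Z > 0  # remember which units were active
        return np.maximum(Z, 0.0)

    def backward(self, dA):
        # Inactive units pass no gradient back.
        return dA * self.mask

    def step(self, lr):
        pass  # no parameters to update


def softmax_cross_entropy(logits, labels):
    """Fused softmax + cross-entropy: returns (mean loss, dLoss/dLogits)."""
    n = logits.shape[0]
    # Subtracting the row max keeps exp() from overflowing.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(n), labels].mean()
    # Gradient is (softmax - one_hot) / n for a mean loss.
    grad = np.exp(log_probs)
    grad[np.arange(n), labels] -= 1.0
    return loss, grad / n


def forward_backward(layers, X, y):
    """Run one forward and backward pass; gradients are stored on each layer."""
    out = X
    for layer in layers:
        out = layer.forward(out)
    loss, grad = softmax_cross_entropy(out, y)
    for layer in reversed(layers):
        grad = layer.backward(grad)
    return loss


# Usage on a random stand-in batch (not real MNIST data).
rng = np.random.default_rng(0)
net = [DenseLayer(784, 128, rng), ReLU(), DenseLayer(128, 10, rng)]
X_demo = rng.random((8, 784))           # pixel-like values in [0, 1)
y_demo = rng.integers(0, 10, size=8)    # integer class labels
demo_loss = forward_backward(net, X_demo, y_demo)
for layer in net:
    layer.step(lr=1e-3)
print(f"loss on random batch: {demo_loss:.4f}")
```

Caching the input in `DenseLayer.forward` and the mask in `ReLU.forward` is what lets each `backward` stay a one-line matrix expression. Because every layer exposes the same three methods, a finite-difference gradient check can perturb any single weight and rerun `forward_backward` without knowing the layer type.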
Results
[Test accuracy from the training run — and how close it lands to an equivalent PyTorch baseline.]
What broke
[The gradient bugs you found and how you caught them — gradient checking? exploding ReLU? learning-rate cliffs?]