
Python_MNIST

Overview

This repository began as a simple neural network that recognizes handwritten digits (0-9), built from scratch using GitHub Codespaces, Claude Opus 4.5, and the MNIST dataset. The repo was then given to Claude Code with a mandate to significantly improve the model.

Result: model accuracy improved from ~80% to 98.5% on the MNIST test set through standard modern deep learning techniques.

Model Architecture

  • Input: 784 pixels (28x28 grayscale image)
  • Hidden layers: 4 layers with 512 → 384 → 256 → 128 neurons
  • Output: 10 neurons (digits 0-9) with softmax
  • Activation: ReLU with 30% dropout
  • Initialization: He initialization for improved gradient flow
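Below is a minimal NumPy sketch of the forward pass implied by the architecture above (layer sizes and the 30% dropout rate come from that list). Names like `forward` and `LAYER_SIZES` are illustrative and may differ from the actual class in network.py.

```python
import numpy as np

# Layer sizes from the architecture above: 784 -> 512 -> 384 -> 256 -> 128 -> 10.
LAYER_SIZES = [784, 512, 384, 256, 128, 10]
DROPOUT_RATE = 0.3  # 30% dropout on hidden activations

rng = np.random.default_rng(0)

# He initialization: weights ~ N(0, 2 / fan_in), biases start at zero.
weights = [rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
           for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])]
biases = [np.zeros(n_out) for n_out in LAYER_SIZES[1:]]

def forward(x, training=True):
    """Forward pass: ReLU + inverted dropout on hidden layers, softmax on the output."""
    a = x
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = a @ w + b
        if i < len(weights) - 1:              # hidden layer
            a = np.maximum(0.0, z)            # ReLU
            if training:                      # inverted dropout: rescale kept units
                mask = rng.random(a.shape) > DROPOUT_RATE
                a = a * mask / (1.0 - DROPOUT_RATE)
        else:                                 # output layer: softmax over 10 classes
            z = z - z.max(axis=-1, keepdims=True)   # numerical stability
            a = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return a

# Example: class probabilities for a batch of two random inputs
probs = forward(rng.random((2, 784)), training=False)
```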

Training Features

  • Mini-batch SGD (batch size 128) with data shuffling
  • Learning rate scheduling with decay at 50%, 70%, and 90% of training
  • 80 epochs with validation monitoring
  • Dropout regularization to prevent overfitting
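A sketch of how these training features fit together, assuming a step-decay schedule at the listed fractions of training. The base learning rate and the `train_step`/`evaluate` callbacks are placeholders, not values taken from train.py.

```python
import numpy as np

EPOCHS = 80
BATCH_SIZE = 128
BASE_LR = 0.1                     # illustrative starting learning rate (not documented above)
DECAY_POINTS = (0.5, 0.7, 0.9)    # decay at 50%, 70%, and 90% of training

def learning_rate(epoch):
    """Step decay: shrink the base rate by 10x at each decay point reached."""
    lr = BASE_LR
    for frac in DECAY_POINTS:
        if epoch >= int(frac * EPOCHS):
            lr *= 0.1
    return lr

def train(x_train, y_train, x_val, y_val, train_step, evaluate):
    """Mini-batch SGD with per-epoch shuffling and validation monitoring.

    `train_step(x_batch, y_batch, lr)` and `evaluate(x, y)` stand in for the
    network's backward pass and accuracy computation.
    """
    n = x_train.shape[0]
    rng = np.random.default_rng(0)
    for epoch in range(EPOCHS):
        lr = learning_rate(epoch)
        order = rng.permutation(n)                  # shuffle the data each epoch
        for start in range(0, n, BATCH_SIZE):
            idx = order[start:start + BATCH_SIZE]
            train_step(x_train[idx], y_train[idx], lr)
        val_acc = evaluate(x_val, y_val)            # validation monitoring
        print(f"epoch {epoch + 1:3d}  lr {lr:.4f}  val acc {val_acc:.4f}")
```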

Structure

  • network.py - Deep neural network class with forward/backward pass
  • train.py - Data loading, training loop with validation
  • predict.py - Image preprocessing and prediction with test-time augmentation
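For the test-time augmentation in predict.py, one common approach is to average the network's softmax outputs over small pixel shifts of the input. The exact augmentations used by the script are not documented here, so the SciPy-based sketch below (1-pixel shifts, `forward` as in the sketch above) is an assumption.

```python
import numpy as np
from scipy.ndimage import shift

def predict_with_tta(image_784, forward):
    """Average predictions over small shifts of the input (test-time augmentation)."""
    img = image_784.reshape(28, 28)
    offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]   # identity + 1-pixel shifts
    probs = np.zeros(10)
    for dy, dx in offsets:
        shifted = shift(img, (dy, dx), order=1, mode="constant", cval=0.0)
        probs += forward(shifted.reshape(1, 784), training=False)[0]
    probs /= len(offsets)
    return int(np.argmax(probs)), probs
```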

Dataset

MNIST - 70,000 labeled images of handwritten digits (28x28 grayscale)
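How train.py obtains and normalizes MNIST is not documented here, but a typical preprocessing step for this architecture flattens each 28x28 image into a 784-dimensional vector scaled to [0, 1] and one-hot encodes the labels, roughly as in this sketch:

```python
import numpy as np

def preprocess(images_uint8, labels):
    """Flatten 28x28 uint8 images to 784-dim floats in [0, 1]; one-hot encode integer labels."""
    x = images_uint8.reshape(len(images_uint8), 784).astype(np.float32) / 255.0
    y = np.eye(10, dtype=np.float32)[labels]     # shape (n, 10), one row per image
    return x, y
```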

Requirements

  • Python 3.x
  • NumPy
  • Pillow
  • SciPy
