Objective
Implement a data-fitting classifier that can predict the label of an input vector. In the context of hand-written digit classification, the label is an integer between 0 and 9, and the input vector is a vectorized 28 × 28 image whose elements are floating-point numbers in [0, 1] giving the grayscale value of each pixel.
My role
Sole contributor
Background
Neural networks can be implemented from scratch using mathematical modeling techniques including linear algebra, least squares, chain rule, gradient descent, and nonlinear optimization.
Least squares
Before diving into the world of neural networks, we first tried to solve the hand-written digit classification problem with a well-understood mathematical tool: least squares. We applied a least-squares classifier to a subset of the MNIST hand-written digit dataset.
The training dataset for this project has 1000 images, which are stored as a 1000 × 784 matrix.
The testing dataset for this project has 200 images, which are stored as a 200 × 784 matrix.
The labels for the training and testing sets are stored as 1D arrays of length 1000 and 200, respectively.
Mathematical model
We solved the least-squares problem 𝑋𝑊 = 𝑌 to fit our dataset, with 𝑋 a 1000 × 785 data matrix (including the bias term!), 𝑊 a 785 × 10 model matrix, and 𝑌 a 1000 × 10 true-value right-hand-side matrix composed of 1 and −1 (or any pair of values that distinguishes the true class from the rest).
For ease of implementation, we solved the problem as ten separate least-squares problems, one for each column of 𝑊 (a 785 × 1 vector) as the unknown.
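The setup above can be sketched in NumPy. This is a minimal illustration with synthetic stand-in data of the stated shapes (the real project used the MNIST subset); `np.linalg.lstsq` solves all ten right-hand-side columns at once, which is equivalent to the ten separate column problems.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins with the shapes described above (the real data
# would come from the MNIST subset).
n_train, n_pixels, n_classes = 1000, 784, 10
images = rng.random((n_train, n_pixels))           # grayscale values in [0, 1]
labels = rng.integers(0, n_classes, size=n_train)  # digit labels 0..9

# Append the bias term: X is 1000 x 785.
X = np.hstack([images, np.ones((n_train, 1))])

# Y is 1000 x 10, with +1 for the true class and -1 elsewhere.
Y = -np.ones((n_train, n_classes))
Y[np.arange(n_train), labels] = 1.0

# Solve X W = Y in the least-squares sense; each column of W is the
# solution of one of the ten separate least-squares problems.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(W.shape)  # (785, 10)
```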
After calculating 𝑊, we used it to predict the class of a new image 𝑥 (with the bias entry appended) by a single vector–matrix multiplication 𝑣 = 𝑥𝑊, then picked the index of the maximum entry, 𝑝 = argmax 𝑣.
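The prediction step is a one-liner. Here is a hedged sketch with a placeholder random 𝑊 standing in for the trained model matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder for the trained 785 x 10 model matrix W.
W = rng.standard_normal((785, 10))

# One new 28 x 28 image, flattened and augmented with the bias entry.
image = rng.random(784)
x = np.append(image, 1.0)  # 785-vector

v = x @ W                  # scores for the ten digit classes
p = int(np.argmax(v))      # predicted digit: index of the largest score
print(p)
```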
We then tested our least-squares classifier and measured its accuracy: 0.913 on the training data and 0.56 on the testing data.
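Accuracy here is simply the fraction of images whose argmax score matches the true label. A minimal sketch, checked on a tiny hand-made example (not the real MNIST data or the reported numbers):

```python
import numpy as np

def accuracy(X, W, labels):
    """Fraction of rows of X whose argmax score matches the true label."""
    predictions = np.argmax(X @ W, axis=1)
    return float(np.mean(predictions == labels))

# Tiny synthetic check: two samples (bias appended) and a W whose
# scores trivially favor each sample's true class.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
W = np.array([[ 1.0, -1.0],
              [-1.0,  1.0],
              [ 0.0,  0.0]])
labels = np.array([0, 1])
print(accuracy(X, W, labels))  # 1.0
```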