Implementing Our First Neural Network
Learn to implement, train, and validate a neural network to classify handwritten digits.
Let’s implement a neural network. Specifically, we will implement a fully connected neural network (FCNN) model.
A classic first step when learning about neural networks is to implement one that can classify handwritten digits. For this task, we’ll use the well-known MNIST dataset.
We might feel a bit skeptical about using a computer vision task rather than an NLP task. However, vision tasks require less preprocessing and are easy to understand.
Because this is our first encounter with neural networks, we’ll see how to implement this model using Keras. Keras is a high-level submodule of TensorFlow that provides a layer of abstraction over its raw operations. As a result, we can implement neural networks with much less effort in Keras than with TensorFlow’s raw operations.
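To make the idea of a fully connected layer concrete, here is a minimal NumPy sketch of the kind of computation a high-level layer hides. It is an illustration only, not the model built in this lesson: the function name dense_forward, the ReLU activation, and the toy sizes are all assumptions chosen for the example.

```python
import numpy as np

def dense_forward(x, weights, bias):
    """Forward pass of one fully connected layer with a ReLU activation."""
    # Affine transform: every input feature feeds every output unit
    z = x @ weights + bias
    # ReLU sets negative pre-activations to zero
    return np.maximum(z, 0.0)

# Hypothetical toy sizes: a batch of 2 inputs with 4 features, 3 output units
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))
weights = rng.normal(size=(4, 3))
bias = np.zeros(3)

out = dense_forward(x, weights, bias)
print(out.shape)  # (2, 3)
```

With Keras, the weights, bias, and activation are bundled into a single layer object, so we don't have to write this arithmetic by hand.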
Preparing the data
First, we need to download the dataset. TensorFlow, out of the box, provides convenient functions to download data, and MNIST is one of those supported datasets.
We will be performing four important steps during the data preparation:
Download and store the data
We need to download the data and store it as numpy.ndarray objects. We’ll create a folder named data within our directory and store the data there.
The following code downloads the dataset:
import os
import tensorflow as tf

os.makedirs('data', exist_ok=True)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data(path=os.path.join(os.getcwd(), 'data', 'mnist.npz'))
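After loading, it can help to confirm that the arrays look the way we expect. The sketch below uses a hypothetical helper, summarize_split, and runs it on a small synthetic stand-in that mimics the MNIST layout (28x28 uint8 images with labels 0 to 9), so it works without downloading anything.

```python
import numpy as np

def summarize_split(images, labels):
    """Return basic facts about one dataset split (hypothetical helper)."""
    # Each image must have exactly one label
    assert images.shape[0] == labels.shape[0], "image/label count mismatch"
    return {
        "num_examples": images.shape[0],
        "image_shape": images.shape[1:],
        "dtype": str(images.dtype),
        "num_classes": int(np.unique(labels).size),
    }

# Synthetic stand-in with an MNIST-like layout, not the real dataset
fake_images = np.zeros((100, 28, 28), dtype=np.uint8)
fake_labels = np.arange(100) % 10

summary = summarize_split(fake_images, fake_labels)
print(summary)
```

The same helper could be called as summarize_split(x_train, y_train) on the real arrays once they are loaded.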
Reshaping the image
Next, we reshape the images so that each 2D grayscale image in the dataset (28×28 pixels) is converted to a 1D vector of 784 values.
The following code reshapes the dataset:
# Reshaping x_train and x_test tensors so that each image is represented as a 1D vector
x_train = x_train.reshape(x_train.shape[0], -1)
x_test = x_test.reshape(x_test.shape[0], -1)
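To see what reshape with -1 does, here is a small sketch on a toy array instead of the real dataset. The array images is a made-up stand-in holding two 3x3 "images"; the first dimension is kept and NumPy infers the second.

```python
import numpy as np

# Toy stand-in: 2 "images" of 3x3 pixels each
images = np.arange(18).reshape(2, 3, 3)

# Keep the image count, let NumPy infer the flattened length (3 * 3 = 9)
flat = images.reshape(images.shape[0], -1)

print(flat.shape)  # (2, 9)
# Row-major flattening places each image's rows back to back
print(flat[0])     # [0 1 2 3 4 5 6 7 8]
```

No pixel values change in this step. Only the layout does, so the original images can be recovered by reshaping back.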
Standardize the image
We standardize each image so that its pixel values have zero mean and unit variance (a step sometimes loosely referred to as whitening).
The following code standardizes the dataset:
import numpy as np

# Standardizing x_train and x_test tensors
x_train = (x_train - np.mean(x_train, axis=1, keepdims=True)) / np.std(x_train, axis=1, keepdims=True)
x_test = (x_test - np.mean(x_test, axis=1, keepdims=True)) / np.std(x_test, axis=1, keepdims=True)
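The effect of this step can be checked on a tiny example. The sketch below wraps the same arithmetic in a hypothetical helper, standardize_rows, and applies it to made-up pixel values. Because axis=1 is used, the mean and standard deviation are computed separately for each row, that is, for each flattened image.

```python
import numpy as np

def standardize_rows(x):
    """Give every row (one flattened image) zero mean and unit variance."""
    mean = np.mean(x, axis=1, keepdims=True)
    std = np.std(x, axis=1, keepdims=True)
    return (x - mean) / std

# Made-up pixel values for two tiny "images"
x = np.array([[0.0, 128.0, 255.0, 64.0],
              [10.0, 20.0, 30.0, 40.0]])

z = standardize_rows(x)
print(np.allclose(z.mean(axis=1), 0.0))  # True
print(np.allclose(z.std(axis=1), 1.0))   # True
```

Note that a row whose values are all identical has a standard deviation of zero, so this formula would divide by zero for such an image.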