What is the MNIST dataset in Keras?

The MNIST (Modified National Institute of Standards and Technology) dataset is a built-in dataset provided by Keras. It consists of 70,000 grayscale images of 28x28 pixels, each of which shows a single handwritten digit from 0 to 9. The training set contains 60,000 images, and the test set contains 10,000 images.

The handwritten digits have been size-normalized and centered to keep the images consistent. The load_data() function loads the dataset from Keras.

Syntax

tensorflow.keras.datasets.mnist.load_data(path="mnist.npz")

Arguments

  • path: the path where the dataset is cached locally, relative to ~/.keras/datasets. This parameter is optional and defaults to "mnist.npz".

Return value

It returns a tuple of two tuples of NumPy arrays, in the form (X_train, y_train), (X_test, y_test).

  • X_train: Training data that consists of grayscale images. It has the shape (60000, 28, 28) and the dtype uint8. Pixel values range from 0 to 255.
  • y_train: Training labels that consist of integers from 0 to 9. It has the shape (60000,) and the dtype uint8.
  • X_test: Testing data that consists of grayscale images. It has the shape (10000, 28, 28) and the dtype uint8. Pixel values range from 0 to 255.
  • y_test: Testing labels that consist of integers from 0 to 9. It has the shape (10000,) and the dtype uint8.
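
The layout described above can be checked programmatically. The following is a minimal sketch, not part of Keras: check_mnist_split is a hypothetical helper name, and the arrays passed to it here are random stand-ins generated with NumPy so that the example runs without downloading the dataset.

```python
import numpy as np

def check_mnist_split(images, labels, expected_count):
    """Hypothetical helper: verify that one split matches the documented layout."""
    if images.shape != (expected_count, 28, 28):
        raise ValueError(f"unexpected image shape: {images.shape}")
    if labels.shape != (expected_count,):
        raise ValueError(f"unexpected label shape: {labels.shape}")
    if images.dtype != np.uint8 or labels.dtype != np.uint8:
        raise ValueError("images and labels are expected to be uint8")
    if labels.max() > 9:
        raise ValueError("labels are expected to be digits from 0 to 9")
    return True

# Stand-in data with the same dtype and per-image shape as MNIST (assumption:
# 100 random samples instead of the real 60,000 training images).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(100,), dtype=np.uint8)

ok = check_mnist_split(fake_images, fake_labels, expected_count=100)
```

With the real dataset, the same helper could be called as check_mnist_split(X_train, y_train, 60000) and check_mnist_split(X_test, y_test, 10000).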

Code

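import tensorflow as tf

# The first call downloads the dataset and caches it locally; later calls reuse the cache.
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Print the shape of each returned NumPy array.
print('Shape of X_train:', X_train.shape)
print('Shape of y_train:', y_train.shape)
print('Shape of X_test:', X_test.shape)
print('Shape of y_test:', y_test.shape)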
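
Because the pixel values are uint8 integers from 0 to 255, a common follow-up step before training is to rescale them to floats between 0 and 1. The sketch below shows one way to do this; scale_pixels is a hypothetical helper name, and the tiny stand_in array is an assumption used in place of X_train so the example runs without downloading anything.

```python
import numpy as np

def scale_pixels(images):
    """Hypothetical helper: convert uint8 pixels (0-255) to float32 values in [0, 1]."""
    return images.astype(np.float32) / 255.0

# Stand-in for a batch of images: one 1x3 "image" covering the minimum,
# a middle value, and the maximum pixel intensity.
stand_in = np.array([[[0, 128, 255]]], dtype=np.uint8)
scaled = scale_pixels(stand_in)
```

With the real dataset, the same call would be scale_pixels(X_train) and scale_pixels(X_test); the array shapes stay unchanged and only the dtype and value range differ.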
