R is a programming language mainly used for statistical computing and data analysis. It offers numerous libraries and packages that facilitate machine learning tasks such as digit classification. Digit classification is a fundamental task in computer vision and machine learning, in which a model is trained to identify handwritten digits and categorize them into their corresponding values.
This Answer will walk us through creating a digit classifier in R using the keras package and the MNIST dataset.
A key step in any data analysis workflow is data preparation, which ensures the data is ready for analysis. After gathering the dataset, the first step is to load the data into the working environment. As mentioned earlier, we'll be using the MNIST dataset, which contains grayscale images of handwritten digits (0–9). Let's load the dataset and split it into training and testing sets:
# Load the keras package, which provides the MNIST helper
library(keras)

# Download/load MNIST and split it into training and testing sets
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
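As a quick sanity check of the split, we can inspect the array dimensions with base R; MNIST ships with 60,000 training and 10,000 test images of 28 × 28 pixels:

# Inspect the shapes of the loaded arrays
dim(x_train)  # 60000 28 28
dim(x_test)   # 10000 28 28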
Data preprocessing ensures that the data we use for machine learning or analysis is clean, consistent, and error-free. It's important to preprocess the data before training a model, as data quality directly impacts the model's performance. Normalization is a preprocessing technique that scales the data to a range between 0 and 1; here, we divide the pixel values (originally 0–255) by 255.
# Scale pixel values from [0, 255] to [0, 1]
x_train <- x_train / 255
x_test <- x_test / 255

# Add a channel dimension, since the convolutional model defined below
# expects input of shape (28, 28, 1)
x_train <- array_reshape(x_train, c(nrow(x_train), 28, 28, 1))
x_test <- array_reshape(x_test, c(nrow(x_test), 28, 28, 1))
Once the data is ready, we need to define the architecture of the model that we'll train and later use for classification. We'll create a simple convolutional neural network (CNN) for digit classification, as it's well suited to image data. Our CNN consists of the following layers:
Input layer
Convolutional layer
Max-pooling layer
Fully connected layer
Output layer
Below is a code snippet that shows how to build a basic CNN model in R using Keras:
# Define the CNN architecture
model <- keras_model_sequential() %>%
  # Convolutional layer: 32 filters of size 3x3 with ReLU activation
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = 'relu',
                input_shape = c(28, 28, 1)) %>%
  # Max-pooling layer: downsample each feature map by a factor of 2
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  # Flatten the feature maps into a vector
  layer_flatten() %>%
  # Fully connected layer with 128 units
  layer_dense(units = 128, activation = 'relu') %>%
  # Output layer: one unit per digit class, softmax probabilities
  layer_dense(units = 10, activation = 'softmax')
Once the model is defined, the next step is to compile it by specifying the loss function, optimizer, and evaluation metric:
# Compile the model with a loss function, an optimizer, and a metric
model %>% compile(
  loss = 'sparse_categorical_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)
The next step is to train the model on the training data, specifying the number of epochs and the batch size:
# Train the model, holding out 20% of the training data for validation
epochs <- 10
batch_size <- 64
history <- model %>% fit(
  x_train, y_train,
  epochs = epochs,
  batch_size = batch_size,
  validation_split = 0.2
)
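The history object returned by fit() records the loss and accuracy after each epoch, and keras for R provides a plot() method for it, which is a handy way to spot underfitting or overfitting:

# Visualize training and validation loss/accuracy across epochs
plot(history)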
The final step is to evaluate the trained digit classifier on the test data. The code is given below:
# Evaluate the trained model on the held-out test set
eval_result <- model %>% evaluate(x_test, y_test)
cat("Test accuracy:", eval_result$accuracy, "\n")
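Beyond the aggregate accuracy, we can classify individual images with predict(). The sketch below is illustrative, assuming the reshaped x_test from the preprocessing step; which.max() picks the most probable class, and we subtract 1 because digit labels start at 0 while R indexing starts at 1:

# Predict class probabilities for the first test image
pred_probs <- model %>% predict(x_test[1, , , , drop = FALSE])
predicted_digit <- which.max(pred_probs[1, ]) - 1
cat("Predicted:", predicted_digit, " Actual:", y_test[1], "\n")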
Softmax function: It's often used in the neural network's output layer to handle multi-class classification problems. We used the MNIST dataset, which contains 10 classes from 0 to 9, to classify digits. Softmax converts the raw output scores into a predicted probability for each class, and the sum of the probabilities across all classes is equal to 1.
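As a quick illustration, here is a minimal, self-contained softmax in base R; the logit values below are invented for the example:

# A minimal softmax sketch; the logits vector is made up
softmax <- function(logits) {
  exps <- exp(logits - max(logits))  # subtract the max for numerical stability
  exps / sum(exps)
}

logits <- c(2.0, 1.0, 0.1, -1.2, 0.5, 0.0, 3.3, -0.7, 1.8, 0.2)
probs <- softmax(logits)
round(probs, 3)
sum(probs)  # always 1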
Adam optimizer: We use the Adam optimizer because it converges quickly, requires little hyperparameter tuning, and generally performs well. It's a common default optimizer for classification problems.
Sparse categorical cross entropy: It's suitable for multi-class classification where the target labels are integers (here, 0–9). It computes the same loss as categorical cross entropy, but accepts integer labels directly instead of one-hot encoded vectors.
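Concretely, for a single example the loss is the negative log of the probability the model assigns to the true label. A toy sketch, with an invented probability vector:

# Sparse categorical cross entropy for one example; probs is made up
sparse_cce <- function(probs, true_label) {
  -log(probs[true_label + 1])  # +1: labels start at 0, R indexing at 1
}

probs <- c(0.02, 0.01, 0.05, 0.02, 0.70, 0.05, 0.05, 0.03, 0.05, 0.02)
sparse_cce(probs, 4)  # true digit is 4; loss = -log(0.70) ~ 0.357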
Evaluation metric: We used accuracy as the evaluation metric because it provides a simple and intuitive measure of the model's performance, and it's a commonly used metric in classification problems. We can use other metrics as per our requirements.
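Accuracy is simply the fraction of predictions that match the true labels; the label vectors in this sketch are made up:

# Accuracy = proportion of correct predictions; toy labels below
y_true <- c(7, 2, 1, 0, 4, 1, 4, 9)
y_pred <- c(7, 2, 1, 0, 4, 1, 4, 8)
mean(y_pred == y_true)  # 0.875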
Epoch: One epoch is one complete pass of the training dataset through the neural network. Choosing an appropriate number of epochs helps avoid underfitting and overfitting. In the complete example below, we set epochs = 5 because we have limited computational power, and it still achieves good accuracy.
Batch size: It refers to the number of samples passed through the network before the model parameters are updated. There is a trade-off between accuracy and speed: large batch sizes can lead to faster training but might result in lower accuracy, while smaller batch sizes can provide better accuracy but can be time-consuming. In the complete example below, we used batch_size = 32 and got good results.
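A quick back-of-the-envelope check of how the batch size shapes training: with MNIST's 60,000 training images, each epoch performs the following number of parameter updates:

# Number of weight updates per epoch with 60,000 training images
n_samples <- 60000
batch_size <- 32
ceiling(n_samples / batch_size)  # 1875 updates per epoch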
The complete code of the digit classifier in R is given below. Press the “Run” button to train a model on the MNIST dataset and find its accuracy.
# Install and load necessary libraries
library(keras)
library(reticulate)

# Install TensorFlow
py_install("tensorflow")

# Load the MNIST dataset
mnist <- dataset_mnist()

# Split the data into training and testing sets
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y

# Normalize the pixel values to be in the range [0, 1]
x_train <- x_train / 255
x_test <- x_test / 255

# Create a simple neural network model (a dense variant of the CNN above)
model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(28, 28)) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')

# Compile the model
model %>% compile(
  loss = 'sparse_categorical_crossentropy',
  optimizer = optimizer_adam(),
  metrics = c('accuracy')
)

# Train the model
model %>% fit(x_train, y_train, epochs = 5, batch_size = 32)

# Evaluate the model
evaluation <- model %>% evaluate(x_test, y_test)
print(evaluation)
cat("Test accuracy:", evaluation$accuracy, "\n")