Basics of Neural Networks

Build on your foundational knowledge of neural networks by learning more advanced concepts and techniques in this lesson.

We'll cover the following...

From artificial intelligence to deep learning
General structure of a fully connected neural network
- Basics of a neural network
- Visualization of a neural network
Different training methods
Supervised learning fundamentals
Training cycles

From artificial intelligence to deep learning

Artificial intelligence is one of today's most active research areas, and it comes with several terms that we often hear. Knowing what these terms mean and how they relate to each other helps clarify which category we will work in during this course.

Artificial intelligence: The theory and development of computer systems that can perform tasks usually requiring human intelligence.
Machine learning: This is a subcategory of artificial intelligence that allows computers to learn without being explicitly programmed.
Deep learning: This is a subcategory of machine learning. It involves algorithms with brain-like logical structures called artificial neural networks.

Deep neural networks can be used for projects in several different subcategories.

If a deep neural network is built to process images or videos, it belongs to the computer vision category. Text and audio processing are other subcategories that can use deep neural networks.


General structure of a fully connected neural network

A neural network is a computing system inspired by the biological neural networks that constitute the human brain. It is based on a collection of connected nodes called artificial neurons, which loosely model the neurons in a physical brain. An artificial neuron receives a signal, processes it, and can then signal the other neurons connected to it.

Basics of a neural network

The basics of a neural network are as follows:

Neuron: A node carrying the signal.
Weight: A coefficient that determines the effect of the signal.
Activation function: A function used to calculate the output signal from the input signals.
Bias: A coefficient to be added to the sum of input signals before going to the activation function. Each output neuron has its own bias.


Consider a simple neural network that calculates the price of a house. It has three input nodes, each with its own weight, one bias for the output node, and an activation function. The first node represents the size of the house, the second represents the location score out of 10, and the third represents the age. Let's take the house properties and the trained network parameters to be as follows:

$n_1 = 160$
$n_2 = 8$
$n_3 = 12$
$w_1 = 3$
$w_2 = 2$
$w_3 = -4$
$\text{bias} = 15$
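$f(x) = x$

The output node would be calculated as follows:

$\text{output} = f(n_1 w_1 + n_2 w_2 + n_3 w_3 + \text{bias}) = f(160 \cdot 3 + 8 \cdot 2 + 12 \cdot (-4) + 15) = f(463) = 463$

So the trained model tells us the optimal sale price of this house should be 463k dollars.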
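
The same calculation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the activation is assumed to be the identity function, since that is what reproduces the 463 figure quoted above.

```python
# Inputs: house size, location score (out of 10), and age, as in the example.
inputs = [160, 8, 12]
# Trained weights, one per input, and the bias of the output neuron.
weights = [3, 2, -4]
bias = 15


def identity(x):
    """Assumed activation: returns its input unchanged."""
    return x


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    weighted_sum = sum(n * w for n, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)


price = neuron_output(inputs, weights, bias, identity)
print(price)  # 463, i.e. 463k dollars
```

Swapping `identity` for another activation function changes only the last step; the weighted sum and the bias are handled in the same way.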

Whether or not this prediction is accurate, we have now seen a simple neural network structure and its components. Additionally, if every neuron in a layer is connected to every neuron in the next layer, the connection is called a fully connected layer. To summarize, models whose structure is built from neurons in this way, with fully connected layers, are called fully connected neural networks. A simple fully connected neural network with four layers is shown below:
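
A four-layer fully connected network can also be sketched in plain Python. The layer sizes, the random untrained weights, and the ReLU activation here are assumptions chosen for illustration, not values from the lesson.

```python
import random


def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: every input feeds every output neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        outputs.append(activation(weighted_sum))
    return outputs


def relu(x):
    """A common activation: negative values become zero."""
    return max(0.0, x)


def make_layer(n_inputs, n_outputs, rng):
    """Random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


rng = random.Random(0)
layer_sizes = [3, 4, 4, 1]  # hypothetical: input, two hidden layers, output
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

signal = [160.0, 8.0, 12.0]
for weights, biases in layers:
    # The output of one layer becomes the input of the next.
    signal = dense_layer(signal, weights, biases, relu)

print(len(signal))  # one value from the single output neuron
```

Four layers of neurons need only three sets of weights, because each set describes the connections between two neighboring layers.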


Different training methods

Apart from the structure of the model, neural network models are also divided into subcategories by their training method. In the above example, we considered a trained model that uses its weights and bias to predict a house price. Training is the step in which the model learns these weights and biases by itself, so that it can use the learned parameters for inference, that is, when the prepared model is used in real life.

Supervised and unsupervised methods are the two main methods to train a model. If we train a neural network by showing it the correct answer, that is a supervised neural network, and the learning type is supervised learning. If we train a neural network without showing it the correct answer, that is an unsupervised neural network, and the learning type is unsupervised learning.

Although there are some other and more complex training methods, we will apply supervised learning using images and their labels as ground truth answers in this course. Therefore, it is necessary to understand the main logic behind supervised learning, as explained below.

Supervised learning fundamentals

If we give the true answer to our question and let the model calculate the difference between that true answer and the prediction made, then use that difference to update its weights and bias, this is supervised learning, and we call that model a supervised neural network.

There are various cost/loss functions to calculate the difference between the prediction and the ground truth (true answer). We feed this difference into a gradient descent algorithm, which provides us with the value to subtract from our weights or bias to update them.

Cost function

The cost function is used to calculate the difference between the prediction and the ground truth (true answer). There are various types, and the cost function is chosen according to the model's prediction type. For example:

For a regression problem, mean squared error is commonly used.
For a classification problem, cross-entropy loss applied to the output of a softmax activation function is commonly used.
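
Both cost functions can be written out in a few lines. The sketch below uses only the Python standard library; the function names and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
import math


def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(softmax(scores)[true_class])


# Regression: predicted vs. true house prices (hypothetical, in thousands).
print(mean_squared_error([463.0, 300.0], [450.0, 310.0]))  # (13**2 + 10**2) / 2 = 134.5

# Classification: raw scores for three classes, where class 0 is correct.
print(cross_entropy([2.0, 1.0, 0.1], true_class=0))
```

The cross-entropy loss is small when the model gives the correct class a high probability and grows as that probability shrinks.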

Gradient descent

The next step is to decide how to use the loss calculated by the cost function to update our weights and biases. A standard answer for neural networks is gradient descent. Gradient descent calculates the derivative of the loss with respect to each weight and bias, then adjusts each parameter in the direction that reduces the loss. Finding optimal weights and biases during neural network training is a minimization problem, where we try to find a local minimum of the error.
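
As a minimal sketch of this minimization, consider a toy loss with a single parameter `w`. The loss function, the starting point, and the step size (called the learning rate in the Backpropagation section) are all assumptions made for illustration.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def loss_gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(start, learning_rate, steps):
    """Repeatedly step against the gradient to reduce the loss."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * loss_gradient(w)
    return w


w_final = gradient_descent(start=0.0, learning_rate=0.1, steps=100)
print(w_final, loss(w_final))  # w ends up close to 3, loss close to 0
```

A real network has many weights and biases instead of one `w`, but each of them is adjusted by the same kind of step.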


Backpropagation

Feeding the model with input data and moving forward from the input nodes to the output nodes is called forward propagation. Conversely, to calculate the gradient for each weight and bias in our layers and update them, we have to move backward from the output toward the input. This process is called backpropagation.

The update operation subtracts the gradient multiplied by a learning rate from the weight and assigns the result as the new value of that weight.

The learning rate is a coefficient that determines the size of the step we take while moving toward the minimum cost.
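
The forward pass, the backward pass, and the update can be put together for a single neuron. This is a sketch under stated assumptions: one input, an identity activation, a squared-error loss, and a made-up data point; none of these values come from the lesson.

```python
def forward(x, w, b):
    """Forward propagation: input -> prediction (identity activation assumed)."""
    return w * x + b


def backward(x, prediction, target):
    """Gradients of the squared error (prediction - target)**2 via the chain rule."""
    d_loss_d_pred = 2.0 * (prediction - target)
    grad_w = d_loss_d_pred * x  # d(prediction)/dw = x
    grad_b = d_loss_d_pred      # d(prediction)/db = 1
    return grad_w, grad_b


def training_step(x, target, w, b, learning_rate):
    """One forward pass, one backward pass, then the weight and bias update."""
    prediction = forward(x, w, b)
    grad_w, grad_b = backward(x, prediction, target)
    new_w = w - learning_rate * grad_w
    new_b = b - learning_rate * grad_b
    return new_w, new_b


# Hypothetical data point: input 2.0 should map to 10.0.
w, b = 1.0, 0.0
for _ in range(50):
    w, b = training_step(2.0, 10.0, w, b, learning_rate=0.05)

print(forward(2.0, w, b))  # approaches the target of 10.0
```

A learning rate that is too large can make the updates overshoot the minimum, while one that is too small makes training slow, which is why it is treated as a value to tune.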


Training cycles

We covered the main steps and methods applied during training. It’s also essential to know some terms used to express when and how many times we apply these steps during the training.

One epoch means one fully completed training cycle. During one epoch, all the input data we have is fed to the network once. We can feed the data one item at a time or in batches, which is where the terms iteration and batch come from. One iteration is the process of feeding one batch to the network, calculating the mean loss over the data in that batch, and updating the weights and biases with backpropagation. When all the epochs are completed, the training is done.

Let's say we train a classification model and have 1000 different images in our dataset. We decide to feed the network with a batch size of 5. This means we give five images to the network, calculate the loss for each, and average these losses to calculate the gradients and finally apply backpropagation. One iteration is completed when these five images have been used, and then we move on to the next iteration. Since we have to give all the data to the network to complete one epoch, five images at a time, we have 1000 / 5 = 200 iterations in 1 epoch.
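
The relationship between epochs, iterations, and batches can be checked with a short loop. The `batches` helper and the number of epochs are hypothetical, and plain integers stand in for the 1000 images.

```python
def batches(dataset, batch_size):
    """Yield consecutive slices of the dataset, each holding batch_size items."""
    for start in range(0, len(dataset), batch_size):
        yield dataset[start:start + batch_size]


dataset = list(range(1000))  # stand-ins for 1000 images
batch_size = 5
num_epochs = 3  # hypothetical value

iterations_per_epoch = 0
total_iterations = 0
for epoch in range(num_epochs):
    iterations_per_epoch = 0
    for batch in batches(dataset, batch_size):
        # A real loop would run the forward pass, average the loss over the
        # batch, and update the weights and biases here.
        iterations_per_epoch += 1
        total_iterations += 1

print(iterations_per_epoch)  # 200 iterations in one epoch
print(total_iterations)      # 600 iterations over 3 epochs
```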

"How many epochs should we train our model for?" and "How large should our batch size be?" are model- and dataset-specific questions, and we have to tune these values to find good answers. We will see some examples of how to tune our model later.


Take-Away Vocabulary

Input layer	The first layer. It consists of the input nodes, which hold the data we want to process.
Hidden layer	The middle part that carries the signals from the input layer to the output layer. It can consist of anywhere from one to a large number of layers.
Output layer	The last layer, which holds our final result.
Supervised learning	A training method that shows the ground truth to the model along with the data.
True answer (ground truth)	The prediction (label) we expect from the model for the given data.
Fully connected neural network	A network in which all the nodes of each layer are connected to all the nodes of the following layer.
Cost function (loss function)	The function that calculates the error between the true answer and the prediction.
Gradient descent	The step of calculating the derivative of the cost with respect to the weights and biases in order to reduce the cost.
Backpropagation	The stage of moving backward through the network to update the weights and biases using the gradients.
Epoch	One full cycle of training, in which all the data is fed to the network once.
Iteration	One step inside an epoch, in which one batch of images is fed to the network.
Batch size	The number of images sent through the network in one iteration to calculate the mean loss for updating the weights and biases.
