AlexNet (2012)

Learn the fundamentals of the AlexNet image classification architecture.

General structure

AlexNet is the image classification architecture that won the ILSVRC competition in 2012.

The general structure of AlexNet is as follows:

  • AlexNet contains eight trainable layers: five convolutional layers at the beginning and three fully connected layers at the end.

  • Three max pooling layers are spread between the convolutional layers.

  • The ReLU activation function is used after every trainable layer except the last one.

  • For the last layer, a softmax activation function is used to obtain predictions as probabilities.

  • A dropout mechanism is used with a rate of 0.5.

  • The weights are initialized from a zero-mean Gaussian (normal) distribution with a standard deviation of 0.01.

  • The biases are initialized with a constant value of 1.

  • The learning rate is initialized to 0.01 and divided by 10 each time the validation error rate stops improving.

  • Stochastic gradient descent is used with a momentum of 0.9 and a batch size of 128.

  • L2 regularization is used.

  • The model is designed for 227x227 RGB images and the 1,000 classes of the ImageNet dataset, and it contains roughly 60 million parameters. A simple view of the architecture is as follows:

AlexNet architecture with output feature map sizes

The above figure illustrates the AlexNet architecture. We have an input image of 227x227x3 (the model submitted to the competition was designed for 227x227x3 input images). We see the convolution and max pooling operations with their kernel size, padding, and stride. Additionally, above each arrow, we see the output feature map dimensions as width, height, and number of channels. At the end, we have three fully connected layers with their dropout rates.
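To make the layer settings above concrete, here is a minimal sketch of the architecture and training setup in PyTorch. The filter counts (96, 256, 384, 384, 256) and the 4096-unit fully connected layers follow the standard AlexNet configuration; the class name AlexNet, the helper init_weights, and the weight decay value 5e-4 are illustrative assumptions, not part of any official implementation.

import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        # Five convolutional layers, with max pooling after the 1st, 2nd, and 5th
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),    # 227x227x3 -> 55x55x96
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 27x27x96
            nn.Conv2d(96, 256, kernel_size=5, padding=2),   # -> 27x27x256
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 13x13x256
            nn.Conv2d(256, 384, kernel_size=3, padding=1),  # -> 13x13x384
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),  # -> 13x13x384
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),  # -> 13x13x256
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 6x6x256
        )
        # Three fully connected layers; dropout (rate 0.5) before the first two
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # softmax is applied to these outputs
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

def init_weights(module):
    # Zero-mean Gaussian weights (std 0.01) and constant biases of 1, as listed above
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        nn.init.constant_(module.bias, 1.0)

model = AlexNet()
model.apply(init_weights)

# SGD with momentum 0.9 and L2 regularization via weight decay (value assumed here);
# the batch size of 128 would be set in the data loader
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
# Divide the learning rate by 10 when the validation metric stops improving
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1)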

Softmax activation function

Even though the different types of activation functions and their advantages and disadvantages are not the subject of this course, we will discuss softmax. The softmax activation function is the most common choice for the last layer of classification architectures, and it's essential to understand how it works.

Softmax is a particular activation function that guarantees the output nodes sum to 1 while each output value stays in the range [0, 1]. Therefore, softmax converts the raw outputs into probabilities, which is very useful for classification problems when it is used as the activation function of the last layer.

Softmax activation converting the outputs to probabilities

After the softmax output, the class with the maximum probability is chosen as the final prediction.
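As a quick illustration, here is a minimal NumPy sketch of the softmax computation followed by picking the most probable class; the three example logits are made-up values for demonstration.

import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; this does not change the result
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])   # raw outputs of the last layer (made-up values)
probs = softmax(logits)

print(probs)             # approximately [0.659 0.242 0.099]; in [0, 1] and summing to 1
print(np.argmax(probs))  # index of the predicted class: 0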

Dropout mechanism

The dropout mechanism is a regularization technique in which a proportion of the nodes is randomly ignored during each training iteration. For example, if we have a dropout rate of 0.3 and 10 nodes, then in every iteration the model randomly chooses three nodes to ignore and performs the forward calculation without them. In the next iteration, a different random set of three nodes is ignored.
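To show the mechanics, here is a minimal NumPy sketch of dropout applied during training. The ten activations and the rate of 0.3 mirror the example above; the scaling by 1/(1 - rate) (so-called inverted dropout) is a common convention assumed here, and the number of dropped nodes is three on average rather than exactly three.

import numpy as np

def dropout(activations, rate=0.3):
    # Randomly zero out roughly a fraction `rate` of the nodes for this iteration
    keep_mask = np.random.rand(activations.shape[0]) >= rate
    # Inverted dropout: scale the kept nodes so the expected activation stays the same
    return activations * keep_mask / (1.0 - rate)

activations = np.ones(10)      # ten nodes, all with activation 1 (made-up values)
print(dropout(activations))    # about three of the ten values are zeroed on each call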

It is an important regularization technique because it helps address the overfitting problem in neural networks. Overfitting occurs when the model performs well on the training data but fails to generalize to unseen data.

The ...
