AlexNet (2012)
Learn the fundamentals of the AlexNet image classification architecture.
General structure
AlexNet is the image classification architecture that won the ILSVRC competition in 2012. Its general structure is as follows:
- AlexNet contains eight trainable layers: five convolutional layers at the beginning and three fully connected layers at the end (see the code sketch after this list).
- Three max pooling layers are spread between the convolutional layers.
- The ReLU activation function is used after every layer except the last one.
- For the last layer, a softmax activation function is used to obtain the predictions as probabilities.
- A dropout mechanism is used with a rate of 0.5.
- To initialize the weights, a zero-mean Gaussian distribution (also called a normal distribution) with a standard deviation of 0.01 is used.
- The biases are initialized with a constant value of 1 (in the original paper, this applies to the second, fourth, and fifth convolutional layers and the fully connected layers; the remaining biases start at 0).
- The learning rate is initialized to 0.01 and divided by 10 every time the validation error rate stops improving.
- Stochastic gradient descent with momentum is used, with momentum = 0.9 and batch size = 128.
- L2 regularization is used (a weight decay of 0.0005 in the original paper).
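To make the bullets above concrete, here is a minimal PyTorch sketch of the architecture. The layer sizes follow the original paper; as is conventional in PyTorch, the final softmax is left to the loss function (e.g., nn.CrossEntropyLoss) rather than baked into the model.

```python
import torch
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        # Five convolutional layers with three max pooling layers spread between them.
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),     # 227x227x3 -> 55x55x96
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 27x27x96
            nn.Conv2d(96, 256, kernel_size=5, padding=2),   # -> 27x27x256
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 13x13x256
            nn.Conv2d(256, 384, kernel_size=3, padding=1),  # -> 13x13x384
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),  # -> 13x13x384
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),  # -> 13x13x256
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # -> 6x6x256
        )
        # Three fully connected layers; dropout with rate 0.5 precedes the first two.
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),  # softmax is applied by the loss function
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)       # x: (batch, 3, 227, 227)
        x = torch.flatten(x, 1)    # -> (batch, 9216)
        return self.classifier(x)  # -> (batch, num_classes) logits
```

Counting the parameters of this single-device sketch with sum(p.numel() for p in model.parameters()) gives roughly 62 million; the paper reports ~60 million because its two-GPU layout splits some of the convolutional connections.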
-
The model was created for 227x227 RGB images and the 1000 classes of the ImageNet dataset, and it contains ~60 million parameters. A simple view of the architecture is shown in the figure below.
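The training recipe in the list maps almost directly onto standard PyTorch utilities. The sketch below is an illustration under those settings rather than the paper's exact pipeline: ReduceLROnPlateau stands in for the manual "divide by 10 when the validation error plateaus" rule, weight_decay implements the L2 penalty, and every bias is initialized to 1 for simplicity instead of the paper's per-layer scheme.

```python
import torch.nn as nn
import torch.optim as optim

model = AlexNet(num_classes=1000)  # the module sketched above

# Weights from a zero-mean Gaussian with std 0.01; biases set to the constant 1.
def init_weights(module: nn.Module) -> None:
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        nn.init.constant_(module.bias, 1.0)

model.apply(init_weights)

# SGD with momentum 0.9; weight_decay is the L2 regularization term
# (0.0005 in the original paper).
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                      weight_decay=5e-4)

# Divide the learning rate by 10 (factor=0.1) when the monitored metric
# (validation error) stops improving.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.1)

# Inside the training loop (batches of 128 images), after each validation pass:
# scheduler.step(val_error)
```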