General structure

AlexNet is the image classification architecture that won the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) competition in 2012. Its general structure is as follows:

  • AlexNet contains eight trainable layers: five convolutional layers at the beginning and three fully connected layers at the end.

  • Three max pooling layers are spread between the convolutional layers.

  • The ReLU activation function is used after each trainable layer except for the last one.

  • For the last layer, a softmax activation function is used to obtain predictions as probabilities.

  • Dropout with a rate of 0.5 is applied in the first two fully connected layers.

  • To initialize the weights, a zero-mean Gaussian distribution (also called a normal distribution) with a standard deviation of 0.01 is used; see the initialization sketch after this list.

  • The biases are initialized with a constant value of 1.

  • The learning rate is initialized at 0.01 and divided by 10 whenever the validation error rate stops improving.

  • Stochastic gradient descent with momentum is used, with momentum = 0.9 and a batch size of 128.

  • L2 regularization (weight decay) is used; the optimizer and learning-rate schedule are sketched after this list.

  • It’s a model created for 227×227 RGB images and the 1,000 classes of the ImageNet dataset, and it contains ~60 million parameters. A simple view of the architecture is sketched in the code below.
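The following is a minimal, single-branch PyTorch sketch of the architecture described above. The original model split its channels across two GPUs, so this merged variant is an approximation of the layer sizes rather than a line-for-line reproduction; the class name and structure are illustrative.

```python
import torch
import torch.nn as nn


class AlexNet(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        self.features = nn.Sequential(
            # Conv 1: 227x227x3 -> 55x55x96, max pooled to 27x27x96
            nn.Conv2d(3, 96, kernel_size=11, stride=4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            # Conv 2: 27x27x96 -> 27x27x256, max pooled to 13x13x256
            nn.Conv2d(96, 256, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            # Conv 3-5: three 3x3 convolutions at 13x13 resolution
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # Third max pooling layer: 13x13x256 -> 6x6x256
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            # Dropout (rate 0.5) before each of the first two FC layers
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            # Last layer outputs raw scores; softmax is applied by the
            # loss function during training (or explicitly at inference)
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)          # 227x227x3 -> 6x6x256
        x = torch.flatten(x, 1)       # 6x6x256 -> 9216
        return self.classifier(x)     # 9216 -> num_classes
```

Summing the parameter counts of this merged variant gives roughly 62 million, consistent with the ~60 million quoted above:

```python
model = AlexNet()
print(sum(p.numel() for p in model.parameters()))  # ~62 million
```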
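The weight and bias initialization can be sketched as follows, assuming the `AlexNet` class above; `init_weights` is an illustrative helper name. (For reference, the original paper set the constant-1 bias only in some layers and used 0 elsewhere; the lesson's uniform scheme is applied here.)

```python
import torch.nn as nn


def init_weights(module: nn.Module) -> None:
    """Apply the initialization scheme to every conv and linear layer."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        # Zero-mean Gaussian (normal) weights, standard deviation 0.01
        nn.init.normal_(module.weight, mean=0.0, std=0.01)
        # Biases initialized with the constant 1
        nn.init.constant_(module.bias, 1.0)


model = AlexNet()
model.apply(init_weights)  # recursively visits every submodule
```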
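A minimal sketch of the corresponding training setup, again assuming the model from the first snippet. The `weight_decay` argument implements the L2 regularization (the 5e-4 coefficient follows the original paper), and `ReduceLROnPlateau` automates the divide-by-10 schedule; the `patience` value is an arbitrary illustrative choice.

```python
import torch
import torch.nn as nn

# SGD with momentum = 0.9; weight_decay adds the L2 penalty
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,            # initial learning rate
    momentum=0.9,
    weight_decay=5e-4,  # L2 coefficient from the original paper
)

# Divide the learning rate by 10 (factor=0.1) whenever the monitored
# validation error stops improving
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=2
)

criterion = nn.CrossEntropyLoss()  # applies log-softmax internally

# The training DataLoader would use batch_size=128 to match the lesson;
# after each epoch's validation pass, call:
# scheduler.step(validation_error)
```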
