What is batch normalization?

We know the importance of normalization as a preprocessing step in machine learning. Similarly, batch normalization (BN) is a common technique used in deep learning models to improve performance and stability. It is usually applied after a convolutional or fully connected (FC) layer, where it normalizes the outputs of the preceding hidden layer before they become inputs to the next hidden layer. Batch normalization is designed to speed up and stabilize the training of neural networks by reducing what is known as internal covariate shift.
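
To make the computation concrete, here is a minimal NumPy sketch of batch normalization at training time. The function name batch_norm, the toy activations, and the parameter shapes are illustrative assumptions, not taken from any particular framework.

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch of activations, then scale and shift.

    x     : outputs of the previous layer, shape (batch_size, features)
    gamma : learnable scale parameter, shape (features,)
    beta  : learnable shift parameter, shape (features,)
    """
    # Per-feature statistics computed over the current mini-batch.
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)

    # Normalize to roughly zero mean and unit variance.
    x_hat = (x - batch_mean) / np.sqrt(batch_var + eps)

    # Scale and shift with learnable parameters so the layer can still
    # represent other distributions if that helps training.
    return gamma * x_hat + beta

# Toy example: hidden-layer outputs for a mini-batch of 4 samples, 2 features.
activations = np.array([[1.0, 200.0],
                        [2.0, 220.0],
                        [3.0, 240.0],
                        [4.0, 260.0]])
gamma = np.ones(2)
beta = np.zeros(2)

normalized = batch_norm(activations, gamma, beta)
print(normalized.mean(axis=0))  # approximately 0 for each feature
print(normalized.std(axis=0))   # approximately 1 for each feature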

Internal covariate shift

Internal covariate shift refers to the change in the distribution of inputs to a network's internal layers during training. Neural networks learn by adjusting the weights of the connections between neurons in response to the training data. These weights determine how each input feature influences the network's output.

However, if the input distribution of a layer keeps changing over time, the hidden layers must repeatedly adapt their weights to this new distribution. This slows down training and may cause the model to perform poorly on unseen data. By using batch normalization, we can mitigate this issue, making the model faster to train and more reliable.
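
In practice, batch normalization is inserted directly after the layer whose outputs it normalizes. The sketch below shows a typical convolutional block (convolution, then batch norm, then activation) in PyTorch; the channel counts, kernel size, and activation slope are illustrative assumptions, not the configuration of any specific YOLO model.

import torch
import torch.nn as nn

# A typical conv block: convolution -> batch norm -> activation.
# Layer sizes here are illustrative only.
conv_block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(num_features=16),  # normalizes each of the 16 channels over the batch
    nn.LeakyReLU(0.1),
)

x = torch.randn(8, 3, 64, 64)  # a mini-batch of 8 RGB images, 64x64
out = conv_block(x)
print(out.shape)               # torch.Size([8, 16, 64, 64])

Setting bias=False on the convolution is a common choice here: the learnable shift inside the batch normalization layer makes a separate convolution bias redundant.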

How does batch normalization help YOLO?
