
Understanding Parameter Sharing, Weak Filters, and Equivariance

Learn about convolutional networks, exploring parameter sharing, weak filters, and translation equivariance.

A convolution process sweeps the entire input. Sweeping is done with a filter smaller than the input. This gives the property of parameter sharing to a convolutional layer in a deep learning network.

Moreover, the filters are small and are therefore called weak filters. This enables sparse interactions. Lastly, during the sweep, the filter also records where its pattern occurs in the input, which gives convolution its equivariance property.
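To make the equivariance property concrete, here is a minimal NumPy sketch (the input, filter, and sizes are illustrative assumptions, not from the lesson). It slides a small filter over an image, shifts the image, and shows that the peak response in the output shifts by the same amount:

```python
import numpy as np

def correlate2d(image, kernel):
    """Valid cross-correlation: apply the same small kernel at every position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 input with a single 2x2 bright pattern (an assumption for illustration).
image = np.zeros((6, 6))
image[1:3, 1:3] = 1.0

kernel = np.ones((2, 2))  # toy 2x2 filter

# Shift the pattern one pixel down and one pixel right.
shifted = np.roll(np.roll(image, 1, axis=0), 1, axis=1)

out = correlate2d(image, kernel)
out_shifted = correlate2d(shifted, kernel)

# Equivariance: the location of the strongest response moves with the input.
print(np.unravel_index(np.argmax(out), out.shape))          # peak at (1, 1)
print(np.unravel_index(np.argmax(out_shifted), out_shifted.shape))  # peak at (2, 2)
```

Translating the input translates the response map by the same offset (up to boundary effects), which is exactly what "the location of filter patterns is also returned" means in practice.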

This section explains the benefits, downsides, and intentions behind these convolution properties.

Parameter sharing

The property of parameter sharing is best understood by contrasting a dense layer with a convolutional layer.

Suppose a dense layer were used for the image detection problem from the previous lesson. The dense layer would learn a filter of the same shape and size as the input image containing the letter “2,” as shown in the illustration below.

An illustration of a filter in a dense layer to detect the letter “2”

The filter size is the same as that of the input, so the parameter space is significantly large. Such a filter is also referred to as a strong filter because it can detect the letter “2” on its own. Although the filter is strong, its excess of parameters makes a dense layer statistically inefficient: it would require a large number of samples to learn the filter automatically.
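The parameter gap is easy to quantify. As a sketch (the 28x28 input and 3x3 filter sizes are illustrative assumptions, not values from the lesson), compare one dense-layer filter against one convolutional filter:

```python
# Hypothetical sizes for illustration: a 28x28 input image and a 3x3 conv filter.
input_h, input_w = 28, 28

# A dense-layer "filter" matches the input's shape, so it needs one weight per pixel.
dense_filter_params = input_h * input_w   # 784 parameters

# A convolutional filter is small and independent of the input size.
conv_filter_params = 3 * 3                # 9 parameters

print(dense_filter_params)  # 784
print(conv_filter_params)   # 9
```

The dense filter here carries 784 parameters versus 9 for the convolutional one, and the gap grows with input size, since the conv filter's parameter count does not depend on the input at all.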

If we compare the size of the dense layer filter in the figure above with either of the convolutional filters, the semicircle or the angle, the latter are clearly smaller.

The convolutional filter is “smaller” than the input. To cover the entire input, it sweeps through it from top to bottom and left to right. This is called parameter sharing.

A filter is a kernel made of a small set of parameters. When the same filter is swept over different parts of the input, these parameters are shared. This is in contrast to a dense layer filter, which is as big as the input and, therefore, does not share parameters.
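The sweep described above can be sketched in a few lines of NumPy (a 1D toy example with assumed sizes, for illustration only). The same three parameters are reused at every window position, whereas a dense layer would need a separate weight for every input-output pair:

```python
import numpy as np

kernel = np.array([1.0, -1.0, 1.0])   # one filter: only 3 shared parameters
signal = np.arange(8, dtype=float)    # toy 1D input of length 8

# Sweep: the SAME kernel is applied at every window position.
out = np.array([np.dot(signal[i:i + 3], kernel)
                for i in range(len(signal) - 2)])

print(out)  # [1. 2. 3. 4. 5. 6.]

# A dense layer mapping the same 8 inputs to these 6 outputs would instead
# need 8 * 6 = 48 independent weights -- no sharing at all.
dense_params = 8 * 6
```

Every entry of `out` was produced by the same three numbers in `kernel`; that reuse across positions is what parameter sharing means.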

The benefit of parameter sharing

Parameter sharing makes a convolutional network statistically efficient. Statistical efficiency is the ability to learn the model parameters with as few samples as possible.

Due to parameter sharing, a convolutional layer can work with small-sized ...