Autoencoders as Neural Network Version of PCA

Exploring autoencoders as a neural network version of PCA for dimension reduction and feature representation.

Background

An autoencoder is a reconstruction model. It attempts to reconstruct its input from itself, as depicted below:

$$x \rightarrow \underbrace{f(x) \rightarrow z}_{\text{encoding}} \rightarrow \underbrace{g(z) \rightarrow \hat{x}}_{\text{decoding}}.$$

An autoencoder is made of two modules: encoder and decoder.

As their names indicate, an encoder $f$ encodes the input $x$ into an encoding $z = f(x)$, and a decoder $g$ decodes $z$ back into the closest possible reconstruction $\hat{x}$.

Training a model to predict (reconstruct) $\hat{x}$ from $x$ may appear trivial. However, an autoencoder does not necessarily strive for perfect reconstruction. Instead, the goal could be dimension reduction, denoising, learning useful features for classification, pretraining another deep network, or something else. Autoencoders, therefore, fall into the category of unsupervised and semi-supervised learning.
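To make the encoder–decoder structure concrete, the sketch below implements a minimal autoencoder in PyTorch. The layer sizes, the ReLU activations, and the MSE reconstruction loss are illustrative assumptions, not prescriptions from the text:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, encoding_dim=32):
        super().__init__()
        # Encoder f: compresses the input x into a lower-dimensional encoding z.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128),
            nn.ReLU(),
            nn.Linear(128, encoding_dim),
        )
        # Decoder g: reconstructs x_hat from the encoding z.
        self.decoder = nn.Sequential(
            nn.Linear(encoding_dim, 128),
            nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)      # z = f(x)
        x_hat = self.decoder(z)  # x_hat = g(z)
        return x_hat

model = Autoencoder()
x = torch.randn(64, 784)           # a dummy batch of inputs
loss = nn.MSELoss()(model(x), x)   # reconstruction error ||x - x_hat||^2
```

Minimizing the reconstruction loss trains both modules jointly; the bottleneck dimension `encoding_dim` is what forces the model to learn a compressed representation rather than copy its input.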

Autoencoders and principal component analysis

Autoencoders were conceptualized in the late 1980s. Those early works were inspired by principal component analysis (PCA), which was invented more than a century ago and has remained a popular machine learning technique for dimension reduction and feature representation. Autoencoders provided a neural network version of PCA. Over the past two decades, autoencoders have advanced well beyond it: sparse autoencoders, whose feature space is larger than the input, were developed, and denoising autoencoders, which learn from corrupted inputs, followed.

For simplicity, a linear single-layer autoencoder (SLA) is compared with PCA. Multiple algorithms exist for fitting a PCA model; one of them estimates the components by minimizing the reconstruction error. This algorithm gives the clearest view of the similarities between PCA and an autoencoder.
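The sketch below illustrates this comparison under stated assumptions: PCA is fitted via SVD, whose top-k components minimize the squared reconstruction error, and a linear SLA is trained by gradient descent on the same objective. The synthetic data, dimensions, and optimizer settings are all illustrative:

```python
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)).astype(np.float32)
X -= X.mean(axis=0)   # PCA assumes centered data
k = 3                 # number of components / encoding size

# PCA via SVD: the top-k right singular vectors minimize reconstruction error.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = X @ Vt[:k].T @ Vt[:k]   # project onto and back from the top-k subspace
pca_err = np.mean((X - X_pca) ** 2)

# Linear SLA: one linear encoder and one linear decoder, no activations.
enc = nn.Linear(10, k, bias=False)
dec = nn.Linear(k, 10, bias=False)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
Xt = torch.from_numpy(X)
for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(dec(enc(Xt)), Xt)
    loss.backward()
    opt.step()

# Both minimize the same objective, so the errors should nearly match, even
# though the SLA's weights need not equal the principal components themselves.
print(f"PCA reconstruction MSE: {pca_err:.5f}")
print(f"SLA reconstruction MSE: {loss.item():.5f}")
```

The reconstruction errors converge to nearly the same value because the linear SLA learns the same k-dimensional subspace as PCA, although its weights may differ from the principal components by an invertible transformation.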

The illustration below visualizes an SLA. It shows that the encoding process is similar to the principal component transformation.
