These hidden layers are where the computation happens. Without hidden layers, a neural network can only apply a single weighted transformation to its inputs, so it cannot capture complex, nonlinear relationships in the data.
Hidden layers are what truly enable deep learning.
Common types of neural networks
While there are various types of neural networks, the most common are:
- Convolutional neural networks (CNNs): These are commonly used to analyze images and power image and facial recognition systems.
- Recurrent neural networks (RNNs): These learn from sequential training data and power speech-recognition apps. One type of RNN is the long short-term memory (LSTM) network, the type of neural network behind Google Translate.
- Multilayer perceptrons (MLPs): These are a foundational type of feedforward artificial neural network (ANN) and the simplest form of deep neural network. MLPs consist of multiple fully connected layers, with each layer applying a nonlinear transformation to a weighted sum of the outputs from the preceding layer (a minimal sketch follows this list).
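To make this concrete, here is a minimal sketch of an MLP forward pass in NumPy. The layer sizes, weights, and input are made up for illustration, and the output layer is left linear so the raw scores are easy to inspect.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Rectified linear unit: max(0, x), applied element-wise
    return np.maximum(0, x)

# Made-up layer sizes: 4 inputs -> 8 hidden units -> 3 outputs
W1 = rng.normal(size=(4, 8))   # weights, input -> hidden
b1 = np.zeros(8)               # hidden-layer biases
W2 = rng.normal(size=(8, 3))   # weights, hidden -> output
b2 = np.zeros(3)               # output-layer biases

def mlp_forward(x):
    # Each layer: a nonlinear transformation of a weighted sum
    h = relu(x @ W1 + b1)      # hidden layer
    return h @ W2 + b2         # output layer (left linear here)

x = rng.normal(size=4)         # one made-up input vector
print(mlp_forward(x))          # three raw output scores
```

Adding more hidden layers simply repeats the weighted-sum-plus-nonlinearity pattern between the input and output layers.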
Neural networks can use different machine learning paradigms, including supervised learning, unsupervised learning, and reinforcement learning.
How neural networks work
A neural network can receive unstructured data sets, classify data points, recognize patterns, and develop an internal representation through which it makes predictions about similar data sets.
Like humans, a neural network learns to perfect its craft over time. It goes through several iterations of computations and adjustments until its predictions reach a reasonable accuracy.
Some of the key computational components in neural networks include:
- Activation functions: Each perceptron has an activation function that shapes its output and introduces nonlinearity; without one, a stack of layers would collapse into a single linear transformation. A common activation function is the sigmoid. Others include the rectified linear unit (ReLU), leaky ReLU, and tanh (all sketched after this list).
- Weight: A value assigned to connections between perceptrons, estimated by the learning algorithm.
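As a reference, here are minimal NumPy sketches of the activation functions named above; the test values are made up for illustration.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Passes positive inputs through, zeroes out negative ones
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # Like ReLU, but lets a small slope (alpha) through for x < 0
    return np.where(x > 0, x, alpha * x)

def tanh(x):
    # Squashes inputs into (-1, 1); NumPy provides this directly
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for fn in (sigmoid, relu, leaky_relu, tanh):
    print(fn.__name__, fn(x))
```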
A neural network’s training process looks like this (a minimal code sketch follows the list):
- Receives input data: Input data is received through the input layer and passed on to the hidden layer(s).
- Generates outputs: The network performs its initial computations using randomly assigned weights.
- Compares outputs: The error between the generated output and required output is represented through a loss function.
- Optimizes: An optimization algorithm is used to reduce the loss, an iterative process that repeats until the loss is minimized to a reasonably small value.
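The following is a minimal sketch of this loop, assuming a toy single-layer model with a mean-squared-error loss and a plain gradient-descent update; the data and learning rate are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1 - receive input data: 100 made-up samples, 3 features each,
# with targets from a known linear rule so convergence is easy to verify
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5])

w = rng.normal(size=3)   # step 2 - start from random weight assignments
lr = 0.1                 # learning rate for the optimizer

for step in range(200):
    pred = X @ w                      # step 2 - generate outputs
    error = pred - y                  # step 3 - compare outputs
    loss = np.mean(error ** 2)        # mean-squared-error loss
    grad = 2 * X.T @ error / len(X)   # gradient of the loss w.r.t. w
    w -= lr * grad                    # step 4 - optimize via gradient descent

print(loss, w)  # loss near zero, w near [2, -1, 0.5]
```

Each pass through the loop is one iteration of the compare-and-adjust cycle described above.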
Our goal when training neural networks is to reduce the error or loss, which means that the network’s generated outputs will ideally match the required outputs. There are several types of loss functions, a common one being the cross-entropy loss function, which is typical in classification tasks.
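For instance, here is a minimal sketch of the cross-entropy loss for a single classification example, assuming the network’s raw outputs are first converted to class probabilities with a softmax:

```python
import numpy as np

def softmax(logits):
    # Convert raw scores into probabilities that sum to 1
    z = logits - np.max(logits)      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, target_index):
    # Negative log-probability assigned to the true class
    return -np.log(probs[target_index])

logits = np.array([2.0, 0.5, -1.0])  # made-up raw network outputs
probs = softmax(logits)
print(cross_entropy(probs, 0))       # small loss: true class got high probability
print(cross_entropy(probs, 2))       # large loss: true class got low probability
```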
To reduce the loss, we update the weights. At this stage, we don’t use random numbers as our weight assignments. Instead, we use optimization algorithms to determine the changes we need to make.
There are many optimization algorithms used to train neural networks. A popular one is gradient descent, an iterative algorithm that repeatedly nudges the weights in the direction that most reduces the loss.
A commonly used variant of gradient descent is stochastic gradient descent, which is well suited for working with large data sets.
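To sketch the difference, stochastic (mini-batch) gradient descent estimates the gradient from a small random subset of the data at each step instead of the full data set. A minimal illustration, with made-up data and hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up data set: 10,000 samples, 3 features, a known linear target
X = rng.normal(size=(10_000, 3))
y = X @ np.array([2.0, -1.0, 0.5])

w = rng.normal(size=3)
lr, batch_size = 0.1, 32

for step in range(500):
    # Estimate the gradient from a small random mini-batch,
    # instead of touching all 10,000 samples every step
    idx = rng.integers(0, len(X), size=batch_size)
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size
    w -= lr * grad

print(w)  # should be close to [2, -1, 0.5]
```

Because each update touches only a handful of samples, the per-step cost stays small even as the data set grows, which is why this variant suits large data sets.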