Neural Network Construction
Learn about TensorFlow’s sequential model building, from initializing layers to model fitting.
TensorFlow provides three simple-to-implement approaches to constructing models:
- Sequential
- Functional
- Model subclassing
They are listed in decreasing order of ease of use. Most modeling requirements are covered by the sequential and functional approaches.
The sequential approach
Sequential is the simplest approach. It constructs models with a linear stack of layers, in which each layer communicates only with the layers directly before and after it. Models in which layers communicate non-sequentially (for example, residual networks) cannot be built with the sequential approach; the functional or model subclassing approach is used in such cases.
Multi-layer Perceptrons (MLPs) are sequential models. Therefore, a sequential model is initialized, as shown below.
model = Sequential()
The initialization here creates a Sequential object. Sequential inherits from the Model class in TensorFlow and, in that way, inherits all of its training and inference features.
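For reference, the snippets in this lesson assume imports along these lines (a minimal sketch; exact import paths can vary slightly across TensorFlow 2.x versions):
# Imports assumed by the snippets in this lesson (TensorFlow 2.x / Keras)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense

# Sequential inherits from Model, so fit(), evaluate(), and predict()
# are available on the object created below.
model = Sequential()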
Input layer
The model starts with an input layer. No computation is performed at this layer, but it still plays an important role.
An input layer can be imagined as a gate to the model. The gate has a defined shape, and this shape must be consistent with the shape of an input sample. This layer has two functions:
- To reject a sample whose shape is not consistent with the defined shape
- To communicate the shape of the incoming batch to the next layer
model.add(Input(shape=(N_FEATURES,)))
The input layer is added to the model as shown above. It takes a shape argument, which is a tuple describing the shape of the input. The tuple contains the length of each axis of the input sample, and its last element is left empty.
Here, the input has only one axis, the feature axis, with length N_FEATURES (defined in the previous lesson). In the case of multi-axis inputs, such as images and videos, the tuple has more elements.
The last (empty) element corresponds to the batch size. The batch size is defined during model fitting and is picked up automatically by the model. The empty element in the tuple can be seen as a placeholder.
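To see the placeholder in action, the short sketch below continues the snippet above and prints the model summary; the undefined batch axis shows up as None in every output shape (N_FEATURES = 69 is taken from this lesson's dataset):
N_FEATURES = 69  # number of input features in this lesson's dataset

model = Sequential()
model.add(Input(shape=(N_FEATURES,)))
model.add(Dense(32, activation='relu'))

# Each output shape in the summary begins with None, the batch placeholder,
# e.g., the dense layer reports an output shape of (None, 32).
model.summary()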
Explicitly defining the input layer is optional. In fact, it is common to define the input shape in the first computation layer. For example:
Dense(..., input_shape=(N_FEATURES,))
The above line represents a dense layer in a neural network. The ellipsis (...) typically includes the number of neurons and the activation function. input_shape=(N_FEATURES,) specifies the shape of the input data, where N_FEATURES is the number of features in each input.
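As an illustration, the first hidden layer of this lesson's model could declare the input shape itself, making a separate Input layer unnecessary (a sketch equivalent to the construction above):
# Equivalent construction: the first computation layer declares the
# expected input shape, so no explicit Input layer is added.
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(N_FEATURES,)))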
Dense layer
A dense layer is one of the primary layers in deep learning. It’s used in MLPs and most other deep learning architectures.
Its importance can be attributed to its simplicity. A linearly activated dense layer is simply an affine transformation of the inputs.
Moreover, unlike most other layers, a dense layer (with linear or nonlinear activation) provides a simple structure for finding a relationship between the features and the response in the same space.
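The affine-transformation claim can be checked directly. The sketch below (with illustrative sizes, not this lesson's configuration) builds a linearly activated dense layer and verifies that its output equals xW + b:
import numpy as np
import tensorflow as tf

# A linearly activated dense layer is an affine transformation: y = xW + b
layer = tf.keras.layers.Dense(3, activation='linear')
x = np.random.rand(4, 5).astype('float32')  # a batch of 4 samples with 5 features
y = layer(x)                                # weights are created on the first call

W, b = layer.get_weights()                  # kernel W and bias b
assert np.allclose(y.numpy(), x @ W + b, atol=1e-5)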
An MLP is a stack of dense layers. That is, every layer from the hidden layers to the output is a dense layer.
The number of hidden layers is a model configuration. As a general principle, it’s recommended to begin with two hidden layers as a baseline.
They’re added to the model as shown below.
model.add(Dense(32, activation='relu', name='hidden_layer_1'))
model.add(Dense(16, activation='relu', name='hidden_layer_2'))
The size of a layer is the first argument. The number of nodes (denoted as units in TensorFlow) in the layer is the same as its size.
The size is a configuration property. It should be set to around half the number of input features. As a convention, the size should be taken from the geometric series of 2, i.e., a number in {2, 4, 8, 16, 32, 64, ...}.
The input sample has 69 features; therefore, the first dense layer is given a size of 32. This also means the input to the second layer has 32 features, and therefore its size is set to 16.
Following these conventions is optional but helps streamline model construction. They are based on the observation that deep learning models are generally insensitive to minor changes in layer size, so it is easier to follow a general principle for configuring layer sizes than to tune each one individually.
Activation is the next argument. It is an important argument because the model is generally sensitive to a poor choice of activation.
Note: An appropriate choice of activation is essential because models are sensitive to it.
relu activation is a good default choice for hidden layers.
The name argument, at the end, is optional. It's added for better readability in the model summary.
Output layer
The output layer in most deep learning networks is a dense layer. This is due to the dense layer’s affine transformation property, which is usually required at the last layer. In an MLP, it’s a dense layer by design.
The output layer should be consistent with the response’s size just like the input layer must be consistent with the input sample’s size.
In a classification problem, the size of the output layer is equal to the number of classes/responses. Therefore, the output dense layer in a binary classifier has a size of one (size=1), as shown below.
model.add(Dense(1, activation='sigmoid', name='output_layer'))
Also, the activation on this layer is dictated by the problem. For regression, if the response is in ...
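Putting the pieces of this lesson together, a minimal end-to-end sketch of the binary-classification MLP might look as follows (the compile settings are illustrative assumptions, not prescribed by this lesson):
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense

N_FEATURES = 69

model = Sequential()
model.add(Input(shape=(N_FEATURES,)))
model.add(Dense(32, activation='relu', name='hidden_layer_1'))
model.add(Dense(16, activation='relu', name='hidden_layer_2'))
model.add(Dense(1, activation='sigmoid', name='output_layer'))

# Illustrative compile settings for a binary classifier; the loss and
# optimizer choices here are assumptions, not part of this lesson.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()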