How to build an LSTM model using PyTorch

Long short-term memory (LSTM) is a special kind of RNN (Recurrent Neural Network). LSTMs have proven to work accurately on various problems in the field, such as speech recognition, machine translation, and more. They overcome a limitation of naïve RNNs, which fail to deal with long-term dependencies in sequences.

Implement the LSTM model in PyTorch

We can build the LSTM model with PyTorch by following these steps:

Step 1

Firstly, we import the PyTorch library into our project using the following code snippet:

import torch
import torch.nn as nn

Step 2

Next, we prepare and load the data set into the project.
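PyTorch doesn't mandate a particular loading mechanism, but torch.utils.data.TensorDataset and DataLoader are commonly used. The snippet below is only a minimal sketch that uses random tensors as stand-in data; the sequence length, batch size, and sample count are assumptions chosen to match the model dimensions defined in Step 3 (30 input features, 15 output classes):

import torch
from torch.utils.data import TensorDataset, DataLoader

# stand-in data: 100 sequences, each 10 time steps long, with 30 features per step
inputs = torch.randn(100, 10, 30)         # (num_samples, seq_dim, feature_dim)
labels = torch.randint(0, 15, (100,))     # one class label (0-14) per sequence

dataset = TensorDataset(inputs, labels)
loader = DataLoader(dataset, batch_size=20, shuffle=True)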

Step 3

Now, we create the LSTM model and define its forward pass. The following code demonstrates this step:

class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        # batch_first=True (batch_dim, seq_dim, feature_dim)

        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):

        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])

        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1
model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

Explanation

  • Line 1: We define the LSTMModel class, which inherits from nn.Module.

  • Line 2: The input_d is the number of expected features in the input. The hidden_d is the number of features in the hidden state.

  • Lines 5–6: We store the number of hidden dimensions and the number of layers.

  • Line 12: We define the readout layer, self.fc, a fully connected nn.Linear layer that maps the hidden state to the output dimension.

  • Line 14: We define the forward function to create the forward pass for the LSTM model.

  • Line 16: We initialize the hidden state with zeros.

  • Line 18: We initialize the cell state with zeros.

  • Line 20: The LSTM unrolls this step once per time step in the input sequence. We detach h0 and c0 because we truncate Backpropagation through time (BPTT), a gradient-based technique used to train certain types of RNNs. If we don't detach, gradients would propagate all the way back to the start of the sequence history.

  • Lines 25–30: We declare the dimension variables and create an LSTMModel object (a quick shape check of this model is sketched after this list).
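To see the expected tensor shapes concretely, here is a small sanity check that runs a random batch through the model; the batch size and sequence length are arbitrary assumptions:

# a random batch of 4 sequences, each 10 time steps long, with 30 features per step
x = torch.randn(4, 10, input_dim)    # (batch_dim, seq_dim, feature_dim)
out = model(x)
print(out.shape)                     # torch.Size([4, 15]) -> one score per class for each sequence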

Step 4

After the model has been instantiated, we instantiate the cross-entropy loss function. The following code snippet demonstrates it:

error = nn.CrossEntropyLoss()
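nn.CrossEntropyLoss expects raw, unnormalized scores (logits) of shape (batch, num_classes) and integer class labels of shape (batch,). As a small illustration with random tensors (the batch size of 4 is an assumption):

logits = torch.randn(4, output_dim)            # what the model's forward pass returns
targets = torch.randint(0, output_dim, (4,))   # one class index (0-14) per sample
loss = error(logits, targets)
print(loss.item())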

Step 5

Next, we instantiate the optimizer; here, we use the SGD optimizer:

learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

Step 6

Finally, the model is trained and used to make predictions. We won't cover this step in detail in this Answer, but a minimal sketch is shown below.
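For completeness, this is roughly what a training loop could look like, assuming the dummy DataLoader sketched in Step 2; the number of epochs is an arbitrary assumption:

num_epochs = 5  # assumption: a small number of epochs for illustration

for epoch in range(num_epochs):
    for batch_inputs, batch_labels in loader:
        optimizer.zero_grad()                  # clear gradients from the previous step
        outputs = model(batch_inputs)          # forward pass
        loss = error(outputs, batch_labels)    # compute cross-entropy loss
        loss.backward()                        # backpropagate
        optimizer.step()                       # update parameters
    print(f"Epoch {epoch + 1}, loss: {loss.item():.4f}")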

Full implementation

The full implementation of the LSTM model is demonstrated below:

# step 1: importing libraries
import torch
import torch.nn as nn

# step 3: creating the model
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()
        self.hidden_dim = hidden_d
        self.layer_dim = layer_d
        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1
model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

# step 4: calculating the cross-entropy loss
error = nn.CrossEntropyLoss()

# step 5: optimizer
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
