Long short-term memory (LSTM) is a special kind of recurrent neural network (RNN) that is capable of learning long-term dependencies in sequential data.
We can build an LSTM model in PyTorch by following these steps:

First, we import the PyTorch library into our project using the following code snippet:
```python
import torch
import torch.nn as nn
```
Next, we prepare and load the data set into the project.
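The exact loading code depends on the task, so as a minimal sketch we can wrap synthetic tensors in a DataLoader. The shapes below (sequences of 10 time steps with 30 features each, and 15 target classes) are illustrative assumptions chosen to match the model dimensions used later:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Illustrative synthetic data: 100 sequences, 10 time steps, 30 features per step
features = torch.randn(100, 10, 30)
# One integer class label (0-14) per sequence
labels = torch.randint(0, 15, (100,))

dataset = TensorDataset(features, labels)
train_loader = DataLoader(dataset, batch_size=20, shuffle=True)
```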
Now, we create the LSTM model and define its forward pass. The following code demonstrates this step:
```python
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()

        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        # batch_first=True (batch_dim, seq_dim, feature_dim)

        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):

        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))

        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)
```
Line 1: We inherit nn.Module in the LSTMModel class.
Line 2: The input_d parameter is the number of expected features in the input, and hidden_d is the number of features in the hidden state.
Lines 5–6: We define the number of hidden dimensions and the number of layers.
Line 12: We define the read-out layer, fc, as a fully connected (nn.Linear) layer.
Line 14: We define the forward function, which creates the forward pass for the LSTM model.
Line 16: We initialize the hidden state with zeros.
Line 18: We initialize the cell state with zeros.
Line 20: We run the LSTM over the input, which unrolls it once per time step of the sequence. We detach h0 and c0 because we truncate backpropagation through time (BPTT); without detaching, gradients would propagate back through every previous batch.
Lines 25–30: We declare the dimension variables and create an LSTMModel object.
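As a quick sanity check, we can pass a dummy batch through the model. The sequence length of 10 here is an arbitrary choice; only the feature dimension (30) is fixed by input_dim:

```python
# A dummy batch: 4 sequences, 10 time steps each, 30 features per step
x = torch.randn(4, 10, 30)
out = model(x)
print(out.shape)  # torch.Size([4, 15]): one score per output class for each sequence
```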
After the model has been instantiated, we define the loss function. We use cross-entropy loss, as the following code snippet demonstrates:
```python
error = nn.CrossEntropyLoss()
```
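Note that nn.CrossEntropyLoss expects raw, unnormalized scores (logits) from the model and integer class indices as targets, which is why the model has no softmax layer. A small illustrative check:

```python
logits = torch.randn(4, 15)           # stand-in for model outputs on a batch of 4
targets = torch.randint(0, 15, (4,))  # true class index for each sample
loss = error(logits, targets)
print(loss.item())                    # a single scalar loss value
```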
Next, we instantiate the optimizer. We use stochastic gradient descent (SGD):
```python
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
```
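To show how the pieces fit together, here is a minimal sketch of a single optimization step; train_loader is the hypothetical DataLoader from the data-preparation sketch above:

```python
for batch_features, batch_labels in train_loader:
    optimizer.zero_grad()                # clear gradients from the previous step
    outputs = model(batch_features)      # forward pass
    loss = error(outputs, batch_labels)  # compute the cross-entropy loss
    loss.backward()                      # backpropagate
    optimizer.step()                     # update the model parameters
    break  # one step for illustration; real training loops over epochs
```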
Finally, the model is trained and used to make predictions. We won't cover the full training and evaluation procedure in this Answer.
The full implementation of the LSTM model is demonstrated below:
```python
# step 1: importing libraries
import torch
import torch.nn as nn

# step 2: preparing and loading the data set (not shown in this Answer)

# step 3: creating the model
class LSTMModel(nn.Module):
    def __init__(self, input_d, hidden_d, layer_d, output_d):
        super(LSTMModel, self).__init__()
        self.hidden_dim = hidden_d
        self.layer_dim = layer_d

        # LSTM model
        self.lstm = nn.LSTM(input_d, hidden_d, layer_d, batch_first=True)
        self.fc = nn.Linear(hidden_d, output_d)

    def forward(self, x):
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        out = self.fc(out[:, -1, :])
        return out

input_dim = 30
hidden_dim = 120
output_dim = 15
layer_dim = 1

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

# step 4: calculating cross entropy loss
error = nn.CrossEntropyLoss()

# step 5: optimizer
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
```