Project Creation: Part Three
Explore how to build and train a stacked LSTM model for an emoji predictor using transfer learning and GloVe embeddings. Learn how to structure LSTM layers, add dropout to reduce overfitting, and perform multi-class classification with softmax activation. Understand the benefits of transfer learning for small datasets and gain hands-on experience to complete this advanced NLP project.
With the work from the previous lessons, our input data is finally in a format that our model can accept. We will now build our model architecture. We are going to use a stacked LSTM-based model.
What is a "Stacked LSTM" based model?
It is nothing complicated. As the name suggests, it means that we have more than one LSTM layer in our model. The first LSTM layer accepts the input data and generates an output. This output is then fed to the next LSTM layer, which generates its own output. This process repeats for as many LSTM layers as you created in your model. Finally, the topmost LSTM layer feeds its output to a densely connected layer to produce the final predictions. Look at the illustration below to get a better understanding. We have assumed the model has two stacked LSTM layers.
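To make the data flow concrete, here is a minimal pure-Python sketch of a forward pass through two stacked LSTM layers followed by a dense softmax layer. It is only an illustration, not the model built in this lesson: the weights are random rather than trained, dropout is left out, and the sizes (`EMBED_DIM`, `HIDDEN`, `NUM_CLASSES`) and all helper names are assumptions chosen to keep the example small. The three time steps mirror the unrolling used in the illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class LSTMLayer:
    """One LSTM layer: maps a sequence of vectors to a sequence of hidden states."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [x_t, h_{t-1}].
        self.weights = {
            gate: make_matrix(hidden_size, input_size + hidden_size, rng)
            for gate in "ifoc"
        }
        self.biases = {gate: [0.0] * hidden_size for gate in "ifoc"}

    def forward(self, sequence):
        h = [0.0] * self.hidden_size  # hidden state
        c = [0.0] * self.hidden_size  # cell state
        outputs = []
        for x in sequence:
            z = x + h  # list concatenation of input and previous hidden state
            pre = {
                gate: [a + b for a, b in zip(matvec(self.weights[gate], z), self.biases[gate])]
                for gate in "ifoc"
            }
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            outputs.append(h)  # one hidden state per time step
        return outputs


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stacked_lstm_predict(sequence, layers, dense_weights, dense_bias):
    # Each layer consumes the full output sequence of the layer below it.
    for layer in layers:
        sequence = layer.forward(sequence)
    # Only the last time step of the topmost layer reaches the dense layer.
    top_final = sequence[-1]
    scores = [s + b for s, b in zip(matvec(dense_weights, top_final), dense_bias)]
    return softmax(scores)


# Assumed toy sizes: 4-dimensional word vectors, 3 hidden units, 5 classes.
EMBED_DIM, HIDDEN, NUM_CLASSES, TIME_STEPS = 4, 3, 5, 3
rng = random.Random(0)
layers = [LSTMLayer(EMBED_DIM, HIDDEN, rng), LSTMLayer(HIDDEN, HIDDEN, rng)]
dense_weights = make_matrix(NUM_CLASSES, HIDDEN, rng)
dense_bias = [0.0] * NUM_CLASSES

# Stand-in for three embedded words (random vectors, not real embeddings).
sentence = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(TIME_STEPS)]
probs = stacked_lstm_predict(sentence, layers, dense_weights, dense_bias)
print(probs)  # one probability per class, summing to 1
```

Note that the first layer has to hand over its hidden state at every time step, because the second layer needs a full sequence as input, while the dense layer only sees the final hidden state of the topmost layer.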
Explanation:
- The input is denoted by the yellow nodes.
- We considered a three-time unrolling of the LSTM network (for