Dataset in Document Classification with ELMo
Learn to classify documents with ELMo, where ELMo generates document embeddings used as inputs to a classification model.
We'll cover the following
Although Word2vec gives a very elegant way of learning numerical representations of words, learning word representations alone is not convincing enough to realize the power of word vectors in real-world applications.
Word embeddings are used as the feature representation of words for many tasks, such as image caption generation and machine translation. However, these tasks involve combining different learning models, such as CNNs and LSTM models or two LSTM models. To understand the real-world usage of word embeddings, let’s stick to a simpler task—document classification.
Document classification
Document classification is one of the most popular tasks in NLP. Document classification is extremely useful for anyone who is handling massive collections of data, such as those for news websites, publishers, and universities. Therefore, it’s interesting to see how learning word vectors can be adapted to a real-world task such as document classification by means of embedding entire documents instead of words.
Get hands-on with 1400+ tech skills courses.