ELMo: Taking Ambiguities Out of Word Vectors
Learn about the ELMo representation algorithm, and find ambiguities in word vectors.
We'll cover the following...
Limitations of vanilla word embeddings
So far, we’ve looked at word embedding algorithms that can give only a unique representation of the words in the vocabulary. However, they will give a constant representation for a given word, no matter how many times we query. Why would this be a problem? Consider the following two phrases:
I went to the bank to deposit some money
and
I walked along the river bank
Clearly, the word “bank” is used in two totally different contexts. If we use a vanilla word vector algorithm (e.g., skip-gram), we can only have one representation for the word “bank,” and it’s probably going to be muddled between the concept of a financial institution and the concept of walkable edges along a river, depending on the references to this word found in the corpus it’s trained on. Therefore, it’s more sensible to provide embeddings for a word while preserving and leveraging the context around it. This is exactly what ELMo is striving for.