Biased Vectors
Learn about biases included in word vectors
Biases in word vectors
The relationships encoded between word vectors are a key reason for their popularity. Embeddings trained on large corpora capture intricate relationships between words and concepts, allowing downstream models to exploit this knowledge and achieve better results. Let’s delve into how this works and explore potential pitfalls.
We’ve already noted that related words tend to produce similar vectors. This property lets models generalize knowledge more effectively. Consider the following example: “Amy is happy because of her new car” is a training instance for a sentiment classification model. If we replace “car” with “van,” the vector for “van” will be close to that of “car,” so the model’s prediction should remain reasonably consistent. As a result, there’s no need to include every synonym of “car” in the training set: the model can handle words it has never seen, provided their vectors are similar to words it has.
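We can make this concrete with a small sketch. The toy 4-dimensional embeddings below are invented purely for illustration; real embeddings such as word2vec or GloVe have hundreds of dimensions and are learned from large corpora. The idea is the same, though: cosine similarity between related words (“car” and “van”) is much higher than between unrelated ones.

```python
import numpy as np

# Toy 4-dimensional embeddings, invented for illustration only.
# Real embeddings (word2vec, GloVe, etc.) are learned and much larger.
embeddings = {
    "car":   np.array([0.9, 0.1, 0.4, 0.0]),
    "van":   np.array([0.8, 0.2, 0.5, 0.1]),
    "happy": np.array([0.0, 0.9, 0.1, 0.8]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related words score much higher than unrelated ones.
print(cosine_similarity(embeddings["car"], embeddings["van"]))    # close to 1
print(cosine_similarity(embeddings["car"], embeddings["happy"]))  # much lower
```

Because a classifier operates on these vectors rather than on raw tokens, swapping “car” for “van” barely moves the input it sees, which is exactly why the prediction stays stable.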