Candidate Sampling
Understand why candidate sampling is used for embedding training.
Chapter Goals:
- Learn about candidate sampling and why it is useful for embedding training
A. Large vocabularies
To obtain good word embeddings, it is usually necessary to train an embedding model on a large amount of text data. A corpus of that size usually brings a very large vocabulary, often tens of thousands of words, and a vocabulary that large can significantly slow down training.
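As a rough illustration of the slowdown, the sketch below assumes the embedding model ends in a full softmax layer, which computes one score per vocabulary word for every training example. The function names, the embedding size of 128, and the vocabulary size of 50,000 are hypothetical values chosen for the example, not taken from this chapter.

```python
import numpy as np

def full_softmax_loss(hidden, output_weights, target_id):
    """Cross-entropy loss using a softmax over every vocabulary word."""
    logits = output_weights @ hidden        # one score per vocabulary word
    logits = logits - logits.max()          # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_id]

def multiply_adds_per_example(vocab_size, embedding_dim):
    """Multiply-adds needed to compute the logits for one training example."""
    return vocab_size * embedding_dim

rng = np.random.default_rng(0)
embedding_dim = 128      # hypothetical embedding size
vocab_size = 50_000      # hypothetical vocabulary size

hidden = rng.normal(size=embedding_dim)
output_weights = rng.normal(size=(vocab_size, embedding_dim)) * 0.01
loss = full_softmax_loss(hidden, output_weights, target_id=42)

small = multiply_adds_per_example(1_000, embedding_dim)
large = multiply_adds_per_example(vocab_size, embedding_dim)
print(f"loss: {loss:.3f}")
print(f"cost ratio (50k vs 1k words): {large / small:.0f}x")  # prints 50x
```

Under this assumption, the cost of the output layer grows linearly with the vocabulary size, so a 50,000-word vocabulary needs about 50 times the work of a 1,000-word one for each training example.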
Training an embedding model is ...