Embedding Loss

Learn about different loss functions for embedding training.

Chapter Goals:

  • Learn about the different types of candidate sampling algorithms and loss functions
  • Calculate the embedding model's loss with the NCE loss function

A. Loss functions

As mentioned in the previous chapter, candidate sampling avoids the cost of a full softmax over the entire vocabulary when calculating the embedding loss. Instead, we use one of two main loss functions: sampled softmax and noise-contrastive estimation (NCE) loss.

Sampled Softmax

As the name suggests, this is a softmax loss computed over a "sampled" subset of classes. That subset contains the actual context vocabulary word (the true label), along with a randomly chosen set of words from the entire vocabulary that act as the false labels. In TensorFlow, we can compute the sampled softmax loss ...
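To make the idea of a softmax over "sampled" classes concrete, here is a minimal pure-Python sketch for a single training example. It is an illustration only, not the TensorFlow implementation: the function name `sampled_softmax_loss`, its parameters, and the toy data are all hypothetical, and the false labels are drawn uniformly. Library implementations typically use a different sampling distribution and also correct the logits for it, which this sketch omits.

```python
import math
import random


def sampled_softmax_loss(embedding, weights, biases, true_id, num_sampled, rng):
    """Cross-entropy over the true class plus `num_sampled` random false classes.

    Hypothetical helper for illustration; samples uniformly and applies
    no correction for the sampling distribution.
    """
    vocab_size = len(weights)

    # Draw the false labels from the vocabulary, excluding the true label.
    candidates = [i for i in range(vocab_size) if i != true_id]
    negatives = rng.sample(candidates, num_sampled)
    classes = [true_id] + negatives  # the true class sits at index 0

    # Logit per selected class: dot(embedding, weights[c]) + biases[c].
    logits = [
        sum(e * w for e, w in zip(embedding, weights[c])) + biases[c]
        for c in classes
    ]

    # Numerically stable log-sum-exp over the sampled classes only.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits)
    )

    # Negative log-probability of the true class within the sampled set.
    return log_sum_exp - logits[0]


# Toy data (assumed values): 10-word vocabulary, 4-dimensional embeddings.
rng = random.Random(0)
vocab_size, embedding_dim = 10, 4
weights = [
    [rng.uniform(-1, 1) for _ in range(embedding_dim)] for _ in range(vocab_size)
]
biases = [0.0] * vocab_size
embedding = [rng.uniform(-1, 1) for _ in range(embedding_dim)]

# Only 1 true class + 5 sampled classes are scored, not all 10.
loss = sampled_softmax_loss(
    embedding, weights, biases, true_id=3, num_sampled=5, rng=rng
)
print(loss)
```

Setting `num_sampled` to `vocab_size - 1` makes every other word a false label, in which case the sketch reduces to the ordinary full softmax loss. The saving comes from choosing `num_sampled` much smaller than the vocabulary size.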