Training and Evaluating the Skip-Gram Model

Learn about the training and evaluation of the skip-gram model.

Calculating word similarities

Our training process is going to be straightforward because we have already defined a function that generates batches of data in exactly the format the model needs. But before we train the model, we need to think about how to evaluate word vector models. The idea behind word vectors is that semantically similar words lie close together in vector space, whereas dissimilar words lie far apart. To measure this closeness, we can use cosine similarity between word vectors. During our hyperparameter discussion, we picked a set of random word IDs and stored them in valid_term_ids. At the end of every epoch, we'll compute the k closest words to each of those terms.
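As a rough sketch of this evaluation step, the helper below finds the top-k most similar word IDs for each query ID using cosine similarity over an embedding matrix. The function name `top_k_similar` and the shape of the inputs are assumptions for illustration; `valid_term_ids` mirrors the variable named in the lesson.

```python
import numpy as np

def top_k_similar(embeddings, valid_term_ids, k=5):
    """Return the k closest word IDs (by cosine similarity) for each query ID.

    embeddings: (vocab_size, embed_dim) array of word vectors.
    valid_term_ids: iterable of word IDs to evaluate (hypothetical example input).
    """
    # Normalize every word vector to unit length so that a dot product
    # between two rows equals their cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / norms
    results = {}
    for term_id in valid_term_ids:
        # Cosine similarity of the query vector against all word vectors.
        sims = normalized @ normalized[term_id]
        # Exclude the query word itself, then keep the k highest-scoring IDs.
        sims[term_id] = -np.inf
        results[term_id] = np.argsort(-sims)[:k].tolist()
    return results
```

Running this once per epoch on a handful of fixed query words gives a quick qualitative check: as training progresses, the nearest neighbors of each query should become more semantically plausible.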
