Implementing GloVe
Learn about the implementation of the GloVe word embedding algorithm.
In this lesson, we’ll discuss the steps for implementing GloVe.
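As background for the steps that follow, recall that GloVe down-weights very frequent co-occurrences before computing its squared loss. A minimal NumPy sketch of the weighting function is shown below; the defaults `x_max = 100` and `alpha = 0.75` come from the original GloVe paper, and the function name `glove_weight` is our own choice for illustration:

```python
import numpy as np

def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function: f(x) = (x / x_max)^alpha if x < x_max, else 1."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

# Rare co-occurrences get small weights; anything at or above x_max is capped at 1.
print(glove_weight([1.0, 50.0, 100.0, 500.0]))
```

This capping is what keeps extremely common pairs (e.g., "the" co-occurring with almost everything) from dominating the loss.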
First, we’ll define the hyperparameters:
```python
batch_size = 4096     # Data points in a single batch
embedding_size = 128  # Dimension of the embedding vector
window_size = 1       # We use a window size of 1 on either side of the target word
epochs = 5            # Number of epochs to train for

# We pick a random validation set to sample nearest neighbors
valid_size = 16       # Random set of words to evaluate similarity on
# We sample valid data points randomly from a large window without always being deterministic
valid_window = 250

# When selecting valid examples, we select some of the most frequent words as well as
# some moderately rare words
np.random.seed(54321)
random.seed(54321)
valid_term_ids = np.array(random.sample(range(valid_window), valid_size))
valid_term_ids = np.append(
    valid_term_ids,
    random.sample(range(1000, 1000 + valid_window), valid_size),
    axis=0
)
```
The hyperparameters we define here are the same ones we discussed earlier: a batch size, an embedding size, a window size, the number of epochs, and, finally, a set of held-out validation word IDs for which we’ll print the most similar words.
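To confirm what the validation sampling above produces, here is a quick self-contained check (repeating the same constants and seeds): the result is 32 IDs in total, with the first 16 drawn from the most frequent words (IDs below 250) and the last 16 from moderately rare words (IDs 1000–1249):

```python
import random
import numpy as np

np.random.seed(54321)
random.seed(54321)

valid_size = 16
valid_window = 250

# 16 IDs from the most frequent words (0..249)
valid_term_ids = np.array(random.sample(range(valid_window), valid_size))
# ...plus 16 IDs from moderately rare words (1000..1249)
valid_term_ids = np.append(
    valid_term_ids,
    random.sample(range(1000, 1000 + valid_window), valid_size),
    axis=0
)

print(valid_term_ids.shape)  # (32,)
```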
We’ll then define the model. First, we’ll import a few things we’ll need down the line:
```python
import tensorflow.keras.backend as K
from tensorflow.keras.layers import Input, Embedding, Dot, Add
from tensorflow.keras.models import Model

K.clear_session()
```
The model is going to have two input layers: `word_i` and `word_j`. They represent a batch of context words and a batch of target words (or a batch of positive skip-grams):
```python
# Define two input layers for context and target words
word_i = Input(shape=())
word_j = Input(shape=())
```
Note how the shape is defined as an empty tuple. This means the final shape of `word_i` and `word_j` will be `[None]`: each layer accepts a vector with an arbitrary number of elements, one scalar word ID per example in the batch.
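To make this concrete, a batch fed into an input with `shape=()` is just a 1-D array of scalar word IDs. A small NumPy sketch (the vocabulary-range value here is an arbitrary assumption for illustration):

```python
import numpy as np

batch_size = 4096

# Each example contributes a single scalar word ID, so a batch for word_i or
# word_j is a 1-D integer array of shape (batch_size,) -- matching [None].
word_i_batch = np.random.randint(0, 250, size=batch_size)  # context word IDs
word_j_batch = np.random.randint(0, 250, size=batch_size)  # target word IDs

print(word_i_batch.shape)  # (4096,)
```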
Next, we’re going to define the embedding layers. ...