Using spaCy's Pre-trained Vectors

Let's learn how to use spaCy's pre-trained vectors effectively.

We previously installed a medium-sized English spaCy language model so that we can use word vectors directly. Word vectors are part of many spaCy language models. For instance, the en_core_web_md model ships with 300-dimensional vectors for 20,000 words, while the en_core_web_lg model ships with 300-dimensional vectors covering a 685,000-word vocabulary.

Typically, small models (those whose names end with sm) do not include any word vectors, but they do include context-sensitive tensors. You can still make the following semantic similarity calculations with them, but the results won't be as accurate as those computed from word vectors.

We can reach a word's vector via the token.vector attribute. Let's look at this attribute in an example. The following code queries the word vector for banana:
