Embeddings and Vector Stores in LangChain

Discover different ways of storing documents and using them in your applications with LangChain

Let’s dive right into one of the most critical pieces of building intelligent applications with LangChain: vector stores. By now, you’ve already experimented with language models to generate content or answer questions. But how do we store and retrieve text data in a way that actually captures its meaning? That’s where vector stores shine.

What are embeddings?

First, let’s talk about embeddings, because vector stores and embeddings go hand in hand. An embedding is a numerical representation of text. Whether your text is a word, a sentence, or an entire document, an embedding model converts it into a list of numbers (a vector) that captures its semantic meaning.

An easy mental image is to think of a giant three-dimensional space (though, in practice, the space often has hundreds or thousands of dimensions). Words or sentences that are related in meaning appear “close” to each other, while unrelated text drifts farther away. For example, “kitten” would be near “cat,” but both would be quite distant from “car.”

Figure: A sample plot of embeddings in three dimensions

This is powerful because machines don’t work with English or any other human language directly; they work with numbers. By encoding text as vectors that preserve semantic relationships, we bridge the gap between human language and the mathematical operations computers excel at.
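To make the idea of “close” concrete, here is a minimal sketch that compares vectors with cosine similarity, one common way to measure how closely two vectors point in the same direction. The three-dimensional vectors are hand-picked, hypothetical values chosen only to mirror the kitten/cat/car example; real embeddings come from a model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-D "embeddings", invented for illustration only.
toy_vectors = {
    "kitten": [0.9, 0.8, 0.1],
    "cat": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

kitten_cat = cosine_similarity(toy_vectors["kitten"], toy_vectors["cat"])
kitten_car = cosine_similarity(toy_vectors["kitten"], toy_vectors["car"])

# A score near 1 means the vectors point in almost the same direction.
print(f"kitten vs cat: {kitten_cat:.3f}")
print(f"kitten vs car: {kitten_car:.3f}")
```

With these toy values, the kitten/cat score comes out much higher than the kitten/car score, which is the numerical counterpart of “near” and “distant” in the mental image above.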

LangChain offers integrations with multiple embedding providers. One popular option is OpenAI, whose text-embedding-3-large model returns 3,072-dimensional vectors by default. Here’s a quick look at how you might use it:

# Requires the langchain-openai package and an OPENAI_API_KEY environment variable.
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

Previously, we used Groq to access our LLM of choice. However, at the time of writing this course, no embedding model was available on Groq, so we will use OpenAI’s embedding models instead.

That’s all it takes to get started. This line of code loads a highly capable model that ...
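As a rough sketch of what a next step might look like: LangChain embedding classes expose two methods, embed_documents (for a list of texts) and embed_query (for a single query string). The example below calls both through a small helper. ToyEmbeddings and describe_embeddings are hypothetical names introduced here so the snippet runs offline without an API key; ToyEmbeddings is not a real embedding model and its vectors carry no semantic meaning.

```python
from typing import List, Tuple


class ToyEmbeddings:
    """Hypothetical offline stand-in that mimics the two-method interface.

    It is NOT a real embedding model: it just counts characters.
    """

    def _embed(self, text: str) -> List[float]:
        lowered = text.lower()
        return [
            float(len(lowered)),
            float(lowered.count("a")),
            float(lowered.count("e")),
        ]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


def describe_embeddings(model, docs: List[str], query: str) -> Tuple[int, int]:
    """Embed docs and a query; return (number of doc vectors, vector length)."""
    doc_vectors = model.embed_documents(docs)  # one vector per document
    query_vector = model.embed_query(query)    # a single vector
    return len(doc_vectors), len(query_vector)


n_docs, dim = describe_embeddings(
    ToyEmbeddings(),
    ["A kitten naps.", "The car is red."],
    "Where is the cat?",
)
print(n_docs, dim)
```

Assuming a valid API key, passing an OpenAIEmbeddings object in place of ToyEmbeddings() should work the same way, with the vector length reflecting the chosen model rather than the toy value of 3.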
