Traditional search methods rely on exact keyword matching: you enter a node’s name or label, and the system returns only results containing that exact term. While this approach works for simple queries, it struggles with more complex scenarios. For example, searching for “William Shakespeare” in a knowledge graph where the node is labeled simply “Shakespeare” would fail, even though both refer to the same person. Clearly, we need a more intelligent approach.
Knowledge graphs have gained attention in recent years for their role in enhancing the capabilities of large language models (LLMs). When paired with LLMs, knowledge graphs act as a structured context provider for answering questions. However, efficiently retrieving relevant nodes or entities from a knowledge graph to feed into an LLM often poses a challenge, especially in large and complex graphs.
This is where vector search comes in.
What is vector search?
Vector search enables us to find similar entities by representing them as embeddings—numerical vectors that capture the semantic meaning of an entity based on its attributes and relationships. By comparing the embeddings of different entities, we can identify those that are most similar in meaning, even when their exact wording or structure differs.
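The comparison step is typically done with cosine similarity, which measures how closely two vectors point in the same direction. Here is a minimal, self-contained sketch using hand-written toy vectors (illustrative values only, not produced by a real embedding model):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: the dot product of the vectors divided by the
    # product of their magnitudes. A value near 1.0 means the vectors
    # point in nearly the same direction, i.e. similar meaning.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional embeddings (real models use hundreds of dimensions).
shakespeare = [0.80, 0.10, 0.30, 0.50]
william_shakespeare = [0.79, 0.12, 0.28, 0.52]
isaac_newton = [0.10, 0.90, 0.40, 0.20]

print(cosine_similarity(shakespeare, william_shakespeare))  # close to 1.0
print(cosine_similarity(shakespeare, isaac_newton))         # noticeably lower
```

Even though “Shakespeare” and “William Shakespeare” are different strings, their embeddings sit close together in vector space, so the similarity score surfaces the match that exact keyword search would miss.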
Embeddings are generated using embedding models, which are specialized machine learning models designed to transform entities, texts, or other forms of data into vector representations.
Here’s how vector search works when used with LLMs for context retrieval: