Introduction to pgvector Extension in PostgreSQL
Learn about the pgvector extension in PostgreSQL, its distance functions and indexing mechanisms.
pgvector is an open-source vector similarity search extension for PostgreSQL that enables efficient storage and querying of high-dimensional vectors. It supports nearest-neighbor search on vector data, which makes it suitable for applications such as recommendation systems, image and text search, and clustering analysis.
By leveraging PostgreSQL's capabilities, pgvector inherits features like ACID compliance, point-in-time recovery, JOINs, and scalability. Additionally, pgvector supports multiple programming languages (Java, Python, Go, C#, etc.), allowing us to generate and store vectors in one language and query them in another. pgvector offers both exact and approximate nearest-neighbor search algorithms, enabling users to strike a balance between accuracy and performance based on their specific requirements.
Basics
We will first need to enable the extension:
CREATE EXTENSION vector;
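If the setup script may run more than once, a guarded variant avoids an error when the extension is already enabled. The sketch below assumes pgvector is already installed on the server; the catalog query is only a sanity check and is not required by pgvector.

```sql
-- Idempotent variant: does nothing if the extension is already enabled.
CREATE EXTENSION IF NOT EXISTS vector;

-- Sanity check: returns one row with the installed version,
-- or no rows if the extension is not enabled in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';
```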
Create a table and insert data:
CREATE TABLE products (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO products (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
The first command creates a table named products with two columns: id, a bigserial primary key, and embedding, a vector of size 3. The second command inserts two rows into the products table, each containing an embedding vector of size 3.
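To confirm what was stored, a quick check along the following lines can help. It is a minimal sketch that assumes the products table created above. vector_dims is a pgvector function that reports a vector's number of dimensions. The commented-out INSERT is included only to illustrate that the declared size is enforced.

```sql
-- List the stored rows together with each vector's dimension count.
SELECT id, embedding, vector_dims(embedding) AS dims
FROM products
ORDER BY id;

-- The column is declared as vector(3), so a vector of any other size
-- is rejected. Uncomment to see the error:
-- INSERT INTO products (embedding) VALUES ('[1,2]');
```

Declaring the dimension in the column type lets the database catch malformed embeddings at insert time rather than at query time.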
Query using distance functions
pgvector provides the ...