Embedding-Based Similarity

Learn how to measure the similarity of long, unstructured texts using embeddings and cosine similarity.

Most entity resolution examples use edit-based similarity functions. These work well for short texts, such as names, addresses, and phone numbers. They are usually a poor fit for long, unstructured texts like product descriptions in e-commerce, because they compare character sequences rather than meaning.
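To make the idea concrete, here is a minimal sketch of cosine similarity applied to embedding vectors. The three vectors are hypothetical, hand-picked toy values rather than output from a real embedding model, and the names `desc_a`, `desc_b`, and `desc_c` are illustrative only. In practice, a trained model would produce vectors with hundreds of dimensions from the product descriptions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings" (assumed values, not from a real model).
desc_a = [0.9, 0.1, 0.3]  # e.g., a description of a product
desc_b = [0.8, 0.2, 0.4]  # a differently worded description of a similar product
desc_c = [0.1, 0.9, 0.0]  # a description of an unrelated product

sim_ab = cosine_similarity(desc_a, desc_b)
sim_ac = cosine_similarity(desc_a, desc_c)

# Vectors pointing in similar directions score closer to 1.
print(f"a vs. b: {sim_ab:.3f}")
print(f"a vs. c: {sim_ac:.3f}")
```

Cosine similarity depends only on the direction of the vectors, not their length, so two descriptions can score as similar even when they share few exact words, provided the embedding model maps them to nearby directions.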

Note: The Abt-Buy dataset we use in this lesson is open data. See the Glossary of this course for attribution and references.
