Binary Classification in Entity Resolution
Get an overview of binary classification in entity resolution.
For every pair of records, we must decide whether they refer to the same real-world entity. That’s a binary classification problem with the classes “match” and “no-match.” However, a typical real-world entity resolution task differs from standard textbook classification examples in several ways:
A huge number of pairs, growing quadratically with the number of records. Most of them are trivial to classify.
A heavy class imbalance, typically with less than 0.1% actual matches.
Very few available labels (if any).
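As a rough illustration of the first two points, the following Python sketch counts candidate pairs and measures the match rate on a tiny made-up dataset. The `count_pairs` helper and the `entity_of` mapping are hypothetical names introduced here for illustration, and the toy labels are assumed to be known, which is rarely the case in practice.

```python
from itertools import combinations


def count_pairs(n_records: int) -> int:
    """Number of unordered record pairs: n * (n - 1) / 2."""
    return n_records * (n_records - 1) // 2


# Quadratic growth: 10x more records means roughly 100x more pairs.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} records -> {count_pairs(n):>13,} pairs")

# Hypothetical toy data: record id -> true entity id (assumed known here).
entity_of = {0: "a", 1: "a", 2: "b", 3: "c", 4: "c", 5: "d"}

# Enumerate every unordered pair and keep those sharing an entity.
pairs = list(combinations(entity_of, 2))
matches = [(i, j) for i, j in pairs if entity_of[i] == entity_of[j]]
match_rate = len(matches) / len(pairs)

print(f"{len(matches)} matches among {len(pairs)} pairs ({match_rate:.1%})")
```

Even in this six-record toy example, matches are a small minority of all pairs. With realistic record counts, the number of pairs grows far faster than the number of matches, so the imbalance becomes much more pronounced.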
Let’s discuss some challenges and opportunities when dealing with binary classification for entity resolution.
Class imbalance and performance evaluation
Let
The restaurants
dataset consists of