Interpretable Matching
Learn how interpretability benefits modeling, training datasets, and user experience.
A matching model predicts whether a pair of records describes the same entity. Model interpretability is about the WHYs: why did the model predict one pair as a match and another as a no-match? The higher a model's interpretability, the better we can answer such questions.
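To make this concrete, here is a tiny sketch (not from the lesson): a pair of records, the pairwise similarity features a matching model consumes, and the kind of match probability it produces. The record values and feature names are assumptions for illustration only.

```python
# Illustrative only: two records that likely describe the same person.
record_a = {"name": "Jon Smith",  "city": "Berlin", "zip": "10115"}
record_b = {"name": "John Smith", "city": "Berlin", "zip": "10115"}

# Pairwise similarity features a matching model typically consumes.
pair_features = {
    "name_similarity": 0.92,  # e.g., a string-similarity score on the names
    "city_match": 1.0,        # exact-match indicator
    "zip_match": 1.0,         # exact-match indicator
}

# A fitted classifier maps these features to a match probability, e.g. ~0.97.
# Interpretability asks WHY: which features pushed the prediction toward "match"?
```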
Why do we even care about interpretability? Why not just focus on building a top-performing prediction model? Here are three very different reasons:
Interpretability is a legal requirement in some use cases. Think of a model predicting payment fraud for an e-commerce shop, or loan default for a lender. Modern systems rely on entity resolution to incorporate multiple loosely connected information sources, like databases of confirmed identity thefts. In many markets, consumers and applicants have the right to know why their transaction or application has been rejected.
A credit card company has a great deal of experience detecting fraud. Practitioners are less likely to deploy a new model they don't trust, even if it outperforms the status quo on test sets. Model explanations build trust by confirming or updating current beliefs with new evidence.
Excellent test performance does not always translate into good predictions on examples outside the labeled data. The test set might not represent specific scenarios well, or there might be temporal drift in the data (for example, a worldwide pandemic that changes the rules of the game). Model explanations should be part of the engineer's debugging toolbox.
The second and third points are closely related. The engineer’s beliefs are part of the debugging process. Let’s explore how interpretability helps identify issues in a black-box model and the training data.
Explaining predictions with SHAP
In this lesson, we interpret a CatBoostClassifier model fitted to df, the labeled dataset of similarity features below:
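The following is a minimal sketch of such a setup, not the lesson's original code: the synthetic df, its column names, and the model hyperparameters are illustrative assumptions, while the CatBoostClassifier-plus-SHAP workflow is the one this lesson interprets.

```python
import numpy as np
import pandas as pd
import shap
from catboost import CatBoostClassifier

# Assumed stand-in for the lesson's df: labeled pairwise similarity features.
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "name_similarity": rng.uniform(0, 1, n),  # e.g., a string-similarity score
    "city_match": rng.integers(0, 2, n),      # exact-match indicator
    "zip_match": rng.integers(0, 2, n),       # exact-match indicator
})
# Synthetic label: pairs with similar names and matching cities tend to be matches.
df["label"] = ((df["name_similarity"] > 0.8) & (df["city_match"] == 1)).astype(int)

X = df.drop(columns="label")
y = df["label"]

# Fit the matching model on the similarity features.
model = CatBoostClassifier(iterations=200, depth=4, verbose=0, random_seed=0)
model.fit(X, y)

# Explain predictions with SHAP values for the tree ensemble.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: which similarity features drive match predictions overall?
shap.summary_plot(shap_values, X)
```

With a real df of similarity features, only the model fitting and the two SHAP calls are needed; the summary plot gives a global ranking of features, and per-row SHAP values explain why an individual pair was scored as a match or a no-match.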