Joins and Lookups

Learn to combine data coming from multiple tibbles in the tidyverse using different types of joins.

One of the most common data-cleaning operations that data scientists face is combining multiple data sources. The datasets might come from multiple files, database tables, or even distinct databases altogether, but in each case they must be brought together into a single table before analysis.

When we join two datasets, we augment one or both input tables with additional columns or rows from the other, depending on the type of join performed. This relies on a key column (or set of key columns) that both input tables share. Where the key values match, the matching records are extended with data from the other table, and the result is a single combined table.
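As a minimal sketch of the idea above, the following uses dplyr's `left_join()` and `full_join()` on two small tibbles. The tables `customers` and `orders`, their columns, and their values are hypothetical and exist only for illustration; `customer_id` plays the role of the common key column.

```r
library(dplyr)

# Hypothetical example data: customer_id is the shared key column.
customers <- tibble(
  customer_id = c(1, 2, 3),
  name        = c("Ana", "Ben", "Cho")
)

orders <- tibble(
  customer_id = c(1, 1, 4),
  amount      = c(20, 35, 50)
)

# Left join: keep every row of customers and add the amount column
# from orders wherever customer_id matches. Customers with no orders
# get NA in amount; a customer with two orders appears twice.
left_joined <- left_join(customers, orders, by = "customer_id")

# Full join: keep rows from both tables, so the order for
# customer_id 4 (absent from customers) also appears, with NA in name.
full_joined <- full_join(customers, orders, by = "customer_id")

print(left_joined)
print(full_joined)
```

In this sketch the left join adds a column to `customers`, while the full join also adds a row that exists only in `orders`, which is one way a join can contribute rows as well as columns.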
