Filtering Datasets

Learn to analyze data subsets using filters in the tidyverse.

In data science work, we often need to analyze only a subset of the data given to us, selected by a condition that we can check within the dataset itself. For example, in a student dataset, we might want the average grade only for students in a particular year or a specific course. To do that, we first filter the data so that only the relevant records remain.
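As a minimal sketch of this idea, the snippet below filters a small, made-up student table down to one year and then averages the grades in that subset. The data frame and the column names StudentID and Year are assumptions invented for illustration; only the Grade column name comes from the lesson's data description.

```r
library(dplyr)

# Hypothetical student records, invented for illustration only
students <- data.frame(
  StudentID = c(1, 2, 3, 4),
  Year      = c(1, 2, 2, 3),
  Grade     = c(78, 85, 91, 66)
)

# Keep only the rows where the condition Year == 2 is TRUE
year_two <- students %>%
  filter(Year == 2)

# Average grade computed over the filtered subset only
year_two_avg <- year_two %>%
  summarise(AverageGrade = mean(Grade))

print(year_two)
print(year_two_avg)
```

The condition passed to filter is evaluated row by row against the columns of the data frame, so any logical expression built from those columns can serve as the filter.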

Using filter

Filters in the tidyverse are applied in much the same way as group_by statements: filter takes a data frame and one or more conditions, and keeps only the rows that satisfy them. In the example below, we use filter to subset student grade data contained in the attached CSV files. The file StudentInformation.csv contains general information about students, while the file GradeData-byCourse.csv contains the students’ grades (Grade) for each course ( ...