Save Duplicate Observations Into a Different Dataset
Learn how to save duplicate observations into a different dataset.
How to save duplicate observations into a different dataset
Earlier we learned how to remove duplicate observations. Often we also need to know why the duplicates exist, and whether observations that are duplicates according to the sorting variables also have duplicate values for the other variables in the dataset. To find out, we can send the duplicate observations to a separate dataset for examination. There are two ways to do this in R; the following code shows the first.
# create a dataset of duplicated observations
pwt7.d <- pwt7[duplicated(pwt7[, c("isocode", "year")]), ]
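One detail worth keeping in mind: duplicated() flags only the second and later occurrences of each key, so a dataset built this way leaves out the first copy of every duplicated pair. The sketch below illustrates this on a small made-up data frame (toy, its values, and the names toy.d and toy.all are hypothetical stand-ins, not part of the pwt7 data) and shows one possible way to keep every copy by also scanning from the end with fromLast = TRUE.

```r
# Hypothetical toy data standing in for pwt7; all values are invented.
toy <- data.frame(
  isocode = c("USA", "USA", "CAN", "CAN", "CAN", "MEX"),
  year    = c(2000, 2000, 2000, 2001, 2001, 2000),
  rgdpl   = c(10, 10, 7, 8, 9, 5)
)

# The sorting variables that define a duplicate.
keys <- toy[, c("isocode", "year")]

# Second and later occurrences only (same logic as the code above).
toy.d <- toy[duplicated(keys), ]

# Every row whose key appears more than once, first occurrence included.
toy.all <- toy[duplicated(keys) | duplicated(keys, fromLast = TRUE), ]

print(toy.d)
print(toy.all)
```

Having all copies side by side makes it easier to see whether the other variables agree (the two USA rows here) or differ (the two CAN 2001 rows).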
An alternative way to inspect duplicate observations is to assign a logical value, TRUE or FALSE, to each observation in the original dataset, with TRUE indicating that the observation has duplicated values for the sorting variables, and to assign the output to a new dataset. We can then apply the View() function to see directly which observations are duplicates, and the table() function to get a frequency count of the duplicate observations in the dataset. The R code is listed below.
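A minimal sketch of this flagging approach, assuming a small made-up data frame in place of pwt7 (toy, toy.flag, the column name dup, and dup.count are hypothetical names chosen for illustration):

```r
# Hypothetical toy data standing in for pwt7; all values are invented.
toy <- data.frame(
  isocode = c("USA", "USA", "CAN", "CAN", "CAN", "MEX"),
  year    = c(2000, 2000, 2000, 2001, 2001, 2000),
  rgdpl   = c(10, 10, 7, 8, 9, 5)
)

# Copy the data and add a logical flag: TRUE when the (isocode, year)
# pair already appeared in an earlier row, FALSE otherwise.
toy.flag <- toy
toy.flag$dup <- duplicated(toy.flag[, c("isocode", "year")])

# View() opens a spreadsheet-style viewer, so call it only interactively.
if (interactive()) View(toy.flag)

# Frequency count of flagged versus unflagged observations.
dup.count <- table(toy.flag$dup)
print(dup.count)
```

Because the flag is stored next to the original columns, no observation is dropped, and the flagged rows can still be extracted later with toy.flag[toy.flag$dup, ].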