Delete and Add Data Using R

Learn why, when, and how to delete data from data frames and add data to them.

Delete data from data frames

There are several reasons why we might delete data during data analysis. One reason is to remove errors or inconsistencies in the data. For example, if a row or cell contains a typo or another mistake, we may need to delete it before the data can be analyzed reliably. Another reason to delete data is to remove outliers or extreme values that could skew the results.

Additionally, we might need to delete data that’s irrelevant to the research question or analysis we are conducting. Removing unnecessary or inaccurate data keeps our work focused and reliable.

Delete data by selection and deselection

The simplest way to delete data is to select only the parts we want to keep (in other words, deselect the unwanted parts) and assign the result back to the same variable name.

By doing so, we avoid holding a second copy of the data in memory under a new name; the existing variable simply becomes smaller.
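Deselection can be sketched as follows. This is a minimal illustration rather than the lesson's own code: it works on a copy called df (a name chosen here for illustration) of R's built-in CO2 dataset, and the dropped columns, the row positions, and the conc threshold are arbitrary choices.

```r
# Work on a copy so the built-in CO2 dataset stays untouched
# (df is a hypothetical name used only for this sketch)
df <- as.data.frame(CO2)

# Deselect columns by name: keep every column except Treatment and uptake
drop_cols <- c('Treatment', 'uptake')
df <- df[, !(names(df) %in% drop_cols)]

# Deselect rows by position: a negative index drops rows 1 to 3
df <- df[-(1:3), ]

# Deselect rows by condition: keep only rows where conc is at most 500
df <- df[df$conc <= 500, ]

print(names(df)) # Remaining column names
print(dim(df))   # Remaining number of rows and columns
```

Each step assigns the result back to df, so no additional variables are created. Note that if only one column were to remain after column deselection, R would return a vector instead of a data frame unless drop = FALSE is passed inside the square brackets.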

You can practice this using the code container below. Here are some examples:

print('-------- Preview of the dataset: -------------')
print(head(CO2, 5)) # Print the first 5 rows of the CO2 dataset
print('-------- Dimensions of the dataset: -------------')
print(dim(CO2)) # Check the dimensions of the dataset
# Select only the columns named Plant, Type and conc
CO2 <- CO2[,c('Plant','Type','conc')]
print('--------------Preview of the dataset after column selection: -----')
print(head(CO2, 5)) # Preview the first 5 rows after column selection
print('The dimensions of the dataset:')
print(dim(CO2)) # Row number and column number

In the code block above, we only select the columns ...