Ready-to-Use Datasets in R
Learn about the details of the dummy datasets in R and how to use them.
Why use dummy datasets?
Analysts often require the ability to experiment with different approaches to identify the most effective solutions for their projects. To facilitate this, we can utilize ready-to-use datasets. These datasets allow us to practice with various data types without investing significant time and effort into data collection.
These datasets are also useful for new learners who want to practice R syntax on real data.
R offers a range of ready-made datasets through its packages, and some of these packages exist specifically for this purpose. One of them is the datasets package, which ships with R and is attached by default.
Accessing the dummy datasets
The exercises in this course will primarily use dummy datasets from the datasets package. We can access them directly by typing their names. The names of some datasets are listed below. Take a look at them to build familiarity.
iris
mtcars
PlantGrowth
pressure
sleep
quakes
rock
attenu
cars
CO2
Indometh
mdeaths
Orange
faithful
nottem
beaver1
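Before working with one of these datasets, it can help to check what it contains. The snippet below is a minimal sketch of one way to do that with base R functions, assuming a standard R installation where the datasets package is attached by default. The variable name available is only illustrative.

```r
# List the names of the datasets shipped with the datasets package.
available <- data(package = "datasets")$results[, "Item"]
print(head(available))

# Not every dataset is a data frame: some, such as mdeaths, are time series.
print(class(iris))
print(class(mdeaths))

# str() gives a compact summary of the columns and their types.
str(mtcars)

# In an interactive session, ?mtcars opens the documentation for a dataset.
```

Checking the class first is useful because functions that expect a data frame may behave differently on a time series such as mdeaths or nottem.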
Let’s practice what we’ve learned so far in the following code block. Feel free to play with the code and explore the datasets mentioned above.
print('mtcars data:')
print(head(mtcars, 5)) # first 5 rows of mtcars data.
print('----------------------------------------------------------------')

print('iris data:')
print(head(iris, 5)) # first 5 rows of iris data.
print('----------------------------------------------------------------')

# Accidentally assigned another object to the name of a dummy variable
iris <- c(1:3)
print(iris) # Print the new object named iris
print('----------------------------------------------------------------')

remove(iris) # Remove the variable name
print(head(iris, 5)) # The original dummy dataset returns
Avoid assigning other objects to the names of datasets, because the new object hides the dataset until it is removed. If this happens, we can use the remove() function (or its shorter alias, rm()), as shown in line 14 above.
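We cannot delete the datasets themselves this way, because they belong to the datasets package rather than to our workspace. Assigning to a dataset's name only creates a new object in the workspace that masks the packaged dataset. Removing that object makes the name refer to the original dataset again.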
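The masking behavior can be checked directly. The following is a small sketch, assuming the datasets package is installed as usual. It uses the package-qualified name datasets::iris to reach the packaged dataset even while another object named iris exists. The variable names masked and restored are only illustrative.

```r
# Assign a new object to the name of a packaged dataset.
iris <- c(1:3)

# The plain name now refers to our vector, not to the packaged dataset.
masked <- !identical(iris, datasets::iris)
print(masked)

# The packaged dataset is still intact and reachable with the :: operator.
print(head(datasets::iris, 3))

# Remove our object so the plain name refers to the packaged dataset again.
rm(iris)
restored <- identical(iris, datasets::iris)
print(restored)
```

Using the package-qualified name is also a reasonable way to read a dataset safely when we are unsure whether its name has been reused in the workspace.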