Data and Software Preparations

Learn about the data and software preparations required for this chapter.

For data preparation, we need to complete the following tasks:

  1. Set up a project folder to hold pwt7 data, program, and output files.
  2. Create a well-documented R program to import pwt7 into R.
  3. Inspect imported data.
  4. Create a new dataset using a subset of pwt7 data.
  5. Install add-on packages that are needed.
  6. Create new variables for later use.

Note: The pwt7 dataset is the same one used in the previous lessons.

Recall from the previous lessons that a folder named Project has been created to hold the data, program, and output files. The R code below demonstrates the full preparation sequence: start with a clean workspace, reset the working directory, import the pwt7 data into R, and briefly inspect the imported data. It then creates a new dataset, arbitrarily named pwt7g, from a subset of pwt7, installs and loads the needed add-on packages, and creates the variable of interest, growth. Finally, it drops the observations before 1960 to limit the influence of the immediate post-WWII recovery and saves the result as an R dataset.
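As a rough illustration of that sequence, here is a minimal sketch rather than the lesson's actual program. It makes several assumptions: a tiny made-up data frame stands in for the imported pwt7 file, the column names country, year, and rgdpl are hypothetical, and growth is assumed to be the annual percentage change in rgdpl. The lag is computed with base R's ave() so the sketch runs without add-on packages; the lesson's own code may use a package function instead.

```r
# (1) Start with a clean workspace
rm(list = ls(all = TRUE))

# (2) Reset the working directory and import the data.
#     Both the path and the file name are hypothetical, so they stay commented:
# setwd("C:/Project")
# pwt7 <- read.csv("pwt7.csv", header = TRUE, stringsAsFactors = FALSE)

# Toy stand-in for the imported pwt7 data (values are made up)
pwt7 <- data.frame(
  country = rep(c("A", "B"), each = 4),
  year    = rep(1958:1961, times = 2),
  rgdpl   = c(100, 110, 121, 133.1, 200, 210, 231, 242.55),
  stringsAsFactors = FALSE
)

# (3) Briefly inspect the imported data
dim(pwt7)
str(pwt7)

# (4) Create pwt7g from a subset of pwt7, sorted by country and year
pwt7g <- pwt7[order(pwt7$country, pwt7$year), c("country", "year", "rgdpl")]

# (5) Lag rgdpl by one year within each country
pwt7g$rgdplag <- ave(pwt7g$rgdpl, pwt7g$country,
                     FUN = function(x) c(NA, head(x, -1)))

# (6) Create the variable of interest: annual growth in percent (assumed definition)
pwt7g$growth <- (pwt7g$rgdpl - pwt7g$rgdplag) * 100 / pwt7g$rgdplag

# (7) Drop the observations before 1960
pwt7g <- pwt7g[pwt7g$year >= 1960, ]

# (8) Save the result as an R dataset (a temporary folder is used here)
out_file <- file.path(tempdir(), "pwt7g.RData")
save(pwt7g, file = out_file)
```

The lag is created before the pre-1960 rows are dropped, so the first retained year for each country still has a growth value whenever the prior year is present in the data.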

We inspect the dataset in R and create a new dataset named pwt7g for analysis. For software preparation in this lesson, we first need to install the following add-on packages in R: DataCombine, ggplot2, dplyr, broom, and gridExtra. Once they are installed, we can load them whenever needed with the library() function.
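One possible way to handle the install-once, load-later pattern is sketched below. The variable names pkgs, is_installed, and missing_pkgs are illustrative, and the interactive() guard is an added assumption so that a batch run never triggers a download.

```r
# Add-on packages named in this lesson
pkgs <- c("DataCombine", "ggplot2", "dplyr", "broom", "gridExtra")

# Check which packages are already available on this machine
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Install the missing ones (needs internet access; only in an interactive session)
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}

# Load every package that is available
for (p in pkgs) {
  if (requireNamespace(p, quietly = TRUE)) {
    library(p, character.only = TRUE)
  }
}
```

In an everyday script, a plain library(ggplot2) call per package is usually enough once installation has been done.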

Note: As discussed in the previous lesson, a best practice for managing the data analysis workflow is to use separate program files for data preparation and analysis. So, we save the R code for data preparation as one program file and the R code for analysis as another.
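To make the two-file workflow concrete, here is a small sketch under stated assumptions. The file names data_preparation.R and analysis.R are hypothetical, a temporary folder stands in for the Project folder, and the data values are made up. The two program files are written out by the sketch itself only so that it is self-contained; in practice they would be written by hand.

```r
# Stand-in for the Project folder
proj <- file.path(tempdir(), "Project")
dir.create(proj, showWarnings = FALSE)

# Program file 1: data preparation, ending by saving the dataset
writeLines(c(
  "pwt7g <- data.frame(country = c('A', 'A'), year = c(1960, 1961), growth = c(2.5, 3.1))",
  "save(pwt7g, file = file.path(proj, 'pwt7g.RData'))"
), file.path(proj, "data_preparation.R"))

# Program file 2: analysis, starting by loading the saved dataset
writeLines(c(
  "load(file.path(proj, 'pwt7g.RData'))",
  "mean_growth <- mean(pwt7g$growth)"
), file.path(proj, "analysis.R"))

# Run the preparation program, then the analysis program.
# local = TRUE evaluates each file in the current environment.
source(file.path(proj, "data_preparation.R"), local = TRUE)
rm(pwt7g)  # the analysis file relies only on the saved dataset
source(file.path(proj, "analysis.R"), local = TRUE)
mean_growth
```

Because the analysis file reads only the saved dataset, it can be rerun without repeating the preparation steps.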
