Prepare Data: Manage Variables
Learn to create variables of different types in R.
Managing the variables (columns) of a data frame often involves creating new variables, renaming variables, recoding variable values, and creating variable labels. This section relies heavily on the earlier discussion of variable types.
Create new variables
Answering a research question often requires creating new variables. Here, we provide examples of how to create numeric, character, and factor variables. We also discuss how to construct leading, lagging, and growth rate variables, and how to compute a new variable representing a group mean.
Numeric variables: Real investment per capita and total real investment
We begin with some simple examples of numeric variables. Suppose we want to use `pwt7` to create two new variables on investment: real investment per capita and total real investment in a country. For this task, the relevant variables are `ki`, `rgdpl`, and `POP`, which are defined in the readme file as follows:
- The variable `ki` is “Investment Share of PPP Converted GDP Per Capita at 2005 constant prices [`rgdpl`], in percent.”
- The variable `rgdpl` is “PPP Converted GDP Per Capita (Laspeyres), derived from growth rates of `c`, `g`, `i`, at 2005 constant prices (2005 International dollar per person).”
- The variable `POP` is “Population (in thousands).”
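Because `ki` is a percentage, it must be divided by 100, and because `POP` is measured in thousands, it must be multiplied by 1000. Therefore, real investment per capita (in 2005 international dollars) should be computed as `rgdpl * ki / 100`, and total real investment (in 2005 international dollars) should be computed as `rgdpl * POP * 1000 * ki / 100`, which simplifies to `rgdpl * POP * ki * 10`.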
The R code for creating these two variables is as follows:
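A minimal sketch of this computation is given here, under stated assumptions. The small `pwt7` data frame is a toy stand-in with made-up values, not the real data set, and the new column names `investpc` and `invest` are hypothetical names chosen for illustration. With the real `pwt7` already loaded, only the two assignment lines would be needed.

```r
# Toy stand-in for pwt7 with made-up values (NOT real Penn World Table data).
pwt7 <- data.frame(
  country = c("A", "B"),
  year    = c(2005, 2005),
  rgdpl   = c(20000, 5000),  # real GDP per capita, 2005 international dollars
  ki      = c(25, 10),       # investment share of rgdpl, in percent
  POP     = c(1000, 50000)   # population, in thousands
)

# Real investment per capita: GDP per capita times the investment share.
# ki is a percentage, so divide by 100.
# (investpc is a hypothetical column name.)
pwt7$investpc <- pwt7$rgdpl * pwt7$ki / 100

# Total real investment: rgdpl * (POP * 1000) * (ki / 100),
# which simplifies to rgdpl * POP * ki * 10.
# (invest is a hypothetical column name.)
pwt7$invest <- pwt7$rgdpl * pwt7$POP * pwt7$ki * 10

# Inspect the two new variables.
print(pwt7[, c("country", "year", "investpc", "invest")])
```

A quick sanity check is that `invest` should equal `investpc * POP * 1000` for every row, since total investment is per-capita investment times the number of people.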