Factors
Here we are going to learn about R factors: how to create them and where they are used.
A factor is a data structure in R used to represent categorical data, that is, fields that can take only a limited, predefined set of values (categorical variables).
For example, the marital status of a person can be one of the following:
- Single
- Married
- Separated
- Divorced
- Widowed
Here we know that the possible values for marital status are Single, Married, Separated, Divorced, and Widowed. These values are predefined and distinct and are called levels.
Creating Factors
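Factors can be created using the factor() function. This function takes a vector of data values. By default, the distinct values found in that vector become the levels; the full set of levels can also be given explicitly through the levels argument.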
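The lesson's original code is not available here, so the following is a minimal sketch of creating a factor. The sample data and the variable names (status, marital, all_levels, marital_full) are assumptions made for illustration.

```r
# Illustrative data: observed marital statuses (not from the lesson)
status <- c("Single", "Married", "Married", "Divorced", "Single")

# Let R infer the levels from the distinct values in the data
marital <- factor(status)
print(marital)
print(levels(marital))  # levels are sorted alphabetically by default

# Declare every possible level, including ones absent from the data
all_levels <- c("Single", "Married", "Separated", "Divorced", "Widowed")
marital_full <- factor(status, levels = all_levels)
print(levels(marital_full))  # levels appear in the order given
```

Passing levels explicitly is useful when some categories do not occur in the data but should still be known to the factor.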
We can check whether a variable is a factor or not by the function is.factor().
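As a small sketch of this check, the example below compares a factor with a plain character vector. The variable names and data are assumptions chosen for illustration.

```r
# Illustrative data (not from the lesson)
marital <- factor(c("Single", "Married", "Widowed"))
plain   <- c("Single", "Married", "Widowed")

print(is.factor(marital))  # TRUE: created with factor()
print(is.factor(plain))    # FALSE: an ordinary character vector
```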
Factors are closely related to vectors: internally, a factor is stored as an integer vector.
R recodes each value in the data as the integer position of its level and keeps the level labels alongside the integer codes.
We can test this using the typeof() function.
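A minimal sketch of that test, assuming some illustrative data, might look like this. It also uses as.integer() and levels() to show the stored codes next to the labels they refer to.

```r
# Illustrative data (not from the lesson)
marital <- factor(c("Single", "Married", "Married", "Widowed"))

print(typeof(marital))      # "integer": the underlying storage type
print(class(marital))       # "factor": how R treats the object
print(levels(marital))      # "Married" "Single" "Widowed"
print(as.integer(marital))  # 2 1 1 3: each value's position in levels()
```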
Accessing and Manipulating Factors
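Factors are accessed and manipulated largely the same way vectors are: elements are selected with square-bracket indexing. One difference is that an element can only be assigned a value that is already among the factor's levels.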
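The following sketch, using assumed sample data and variable names, shows vector-style indexing on a factor and what happens when a value outside the levels is assigned.

```r
# Illustrative data with all five levels declared (not from the lesson)
marital <- factor(
  c("Single", "Married", "Divorced", "Married"),
  levels = c("Single", "Married", "Separated", "Divorced", "Widowed")
)

print(marital[2])        # a single element; the levels are kept
print(marital[c(1, 3)])  # several elements by position
print(marital[-1])       # everything except the first element

# Assigning an existing level works like ordinary vector assignment
marital[4] <- "Widowed"

# "Engaged" is not a level, so R warns and stores NA instead
marital[1] <- "Engaged"
print(marital)
```

To allow a new value, it would first have to be added to the levels, for example with levels(marital) <- c(levels(marital), "Engaged").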