Two Numerical Explanatory Variables

Learn how to fit multiple regression models with two numerical explanatory variables.


Let’s now consider multiple regression models where, instead of one numerical and one categorical explanatory variable, we have two numerical explanatory variables. The dataset we’ll use comes from the textbook An Introduction to Statistical Learning with Applications in R (James et al., 2017). Its accompanying ISLR R package contains the datasets to which the authors apply various machine-learning methods.

One frequently used dataset in this course is the Credit dataset, where the outcome variable of interest is the credit card debt of 400 individuals. Other variables, such as income, credit limit, credit rating, and age, are included as well. Note that the Credit data isn’t based on real individuals’ financial information; it is a simulated dataset used for educational purposes.

In this lesson, we’ll fit a regression model where we have:

  • A numerical outcome variable y, the cardholder’s credit card debt

  • Two explanatory variables:

    • One numerical explanatory variable x1, which is the cardholder’s credit limit

    • Another numerical explanatory variable x2, which is the cardholder’s income (in thousands of dollars)
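As a sketch of what this setup implies, a model with these two explanatory variables and no interaction term can be written as the equation of a regression plane. The coefficient names b_0, b_1, and b_2 are notation assumed here for illustration, not taken from the lesson text:

```latex
% Fitted regression plane with two numerical explanatory variables.
% \hat{y}: fitted credit card debt
% x_1: credit limit, x_2: income (in thousands of dollars)
% b_0, b_1, b_2: fitted intercept and slopes (assumed notation)
\[
  \hat{y} = b_0 + b_1 x_1 + b_2 x_2
\]
```

Under this form, b_1 is read as the change in fitted debt associated with a one-unit increase in credit limit while income is held fixed, and b_2 is read the same way for income.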

Exploratory data analysis

Let’s load the Credit dataset. To keep things simple, let’s select() only the subset of variables we’ll consider in this lesson and save the result in a new data frame, credit_ch6. Notice our slightly different use of the select() verb here: it can rename a variable at the same time as it selects it. For example, we’ll select the Balance variable from Credit but save it under the new variable name debt. We do this because the term “debt” is easier to interpret here than “balance.”
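The steps just described might look like the following minimal sketch. It assumes the ISLR and dplyr packages are installed. The names credit_ch6 and debt come from the text above, while credit_limit and income are illustrative names chosen here, and the lesson's own code may select additional variables:

```r
library(ISLR)   # provides the Credit data frame
library(dplyr)  # provides select(), glimpse(), and the %>% pipe

# Keep only the variables used in this lesson.
# Inside select(), new_name = old_name picks a column and renames it in one step.
# credit_limit and income are assumed names used for illustration.
credit_ch6 <- Credit %>%
  select(debt = Balance, credit_limit = Limit, income = Income)

# Take a quick look at the variables and their types
glimpse(credit_ch6)
```

Renaming inside select() avoids a separate rename() step and leaves the original Credit data frame unchanged.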
