Creating Functions

Learn to define functions in R and the best practices for using them.

Functions are an essential element of code organization. Without functions, our code quickly becomes a tangled mess of repeated lines and nested loops. But with them, we can reduce complexity, increase readability, and save valuable time by reusing code.

As in other programming languages, we can define functions in R that take a set of inputs, process them, and return some outputs. Creating functions lets us streamline our code by reducing repetitive sections and keeping it modular, so we can easily reuse steps that we’ve already coded. And the best part? We can create functions (and even packages!) to share with colleagues.
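To make the inputs, processing, and output idea concrete, here is a minimal sketch in base R. It is an illustration rather than part of the lesson's code: the function name avg_mpg_by_cyl and its arguments are hypothetical, and it assumes the built-in mtcars dataset so that no packages are needed.

```r
# Minimal base R sketch; avg_mpg_by_cyl is a hypothetical name used for illustration.
# mtcars ships with base R (datasets package), so nothing needs to be installed.

# Without a function: the same steps are repeated for every group
avg_mpg_4cyl <- mean(mtcars$mpg[mtcars$cyl == 4])
avg_mpg_6cyl <- mean(mtcars$mpg[mtcars$cyl == 6])

# With a function: inputs go in, they are processed, and one output comes back
avg_mpg_by_cyl <- function(car_data, n_cyl) {
  rows <- car_data$cyl == n_cyl   # process: flag the rows with n_cyl cylinders
  mean(car_data$mpg[rows])        # output: the last evaluated value is returned
}

# Reuse the same logic with different inputs
avg_mpg_by_cyl(mtcars, 4)
avg_mpg_by_cyl(mtcars, 6)
```

The two repeated lines at the top collapse into one definition, so a change to the calculation only has to be made in one place.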

Let’s take a look at a code example of how to create functions in R:

#Load tidyverse libraries
library(ggplot2)
library(purrr)
library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
library(stringr)
library(readr)
library(forcats)

#Load the mpg dataset (shipped with ggplot2)
IN_MtCars <- mpg

#Define a function to calculate average mpg for a given car model
calculate_avg_mpg <- function(IN_CarData, IN_CarModel) {
  OUT_AvgMpg <- IN_CarData %>%
    filter(model == IN_CarModel) %>% #filter the input data to the specified model
    summarize(avg_mpg = mean(hwy)) #calculate the mean highway mpg for that model

  return(OUT_AvgMpg)
}

#Call the function to calculate the average mpg for the "a4" car model
OUT_AvgMpgA4 <- calculate_avg_mpg(IN_MtCars, "a4")

#With a default value for IN_CarModel, define a function to calculate average mpg
#for a given car model
calculate_avg_mpg_a4D <- function(IN_CarData, IN_CarModel = "a4") {
  OUT_AvgMpg <- IN_CarData %>%
    filter(model == IN_CarModel) %>% #filter the input data to the specified model
    summarize(avg_mpg = mean(hwy)) #calculate the mean highway mpg for that model

  return(OUT_AvgMpg)
}

#Call the function without a car model, so the default ("a4") is used
OUT_AvgMpgA4D <- calculate_avg_mpg_a4D(IN_MtCars)

#Define a function to calculate average mpg for a variable number of car models
calculate_avg_mpg_mult <- function(IN_CarData, ...) {
  car_models <- enquos(...) #capture the car models passed in via ... as a list

  OUT_AvgMpgs <- map(car_models, ~IN_CarData %>% #for each model passed in
    filter(model == !!.x) %>% #filter to the model type
    summarise(avg_mpg = mean(hwy)) %>% #calculate mean highway mpg
    pull(avg_mpg)) #save the result as a number

  return(OUT_AvgMpgs)
}

#Call the function to calculate the average mpg for multiple car models
OUT_AvgMpgMult <- calculate_avg_mpg_mult(IN_MtCars, "a4", "corvette", "malibu")

#Print the result of the first function
paste0("The average mpg for the a4 is ", OUT_AvgMpgA4)

#Print the result of the function with a default
paste0("By default - The average mpg for the a4 is ", OUT_AvgMpgA4D)

#Print the results of the multiple function
paste0("By multiple - The average mpg for the a4 is ", OUT_AvgMpgMult[1])
paste0("By multiple - The average mpg for the corvette is ", OUT_AvgMpgMult[2])
paste0("By multiple - The average mpg for the malibu is ", OUT_AvgMpgMult[3])

  • Lines 15–21: Define a function called calculate_avg_mpg with two input arguments, IN_CarData and IN_CarModel.

    • Lines 16–18: Filter the input dataset (IN_CarData) to the car model passed in IN_CarModel, and then calculate the mean highway miles per gallon (mean(hwy)) for that car model.
    • Line 20: Return the result of the calculation above as the result of the function.
  • Line 24: Call the function, using IN_MtCars as the input dataset and "a4" as the car model to be measured. Save the result in OUT_AvgMpgA4.

  • Lines 28–34: Define a function very similar to the one defined above (lines 15–21), but now assign a default value to IN_CarModel, which will be used in case the argument is left unspecified.

  • Line 37: Call the function calculate_avg_mpg_a4D ...
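
The default-argument pattern described above can also be sketched in base R. This is an illustration under stated assumptions, not the lesson's own code: avg_mpg_default and its arguments are hypothetical names, and the built-in mtcars dataset stands in for the lesson's data so the snippet runs without any packages.

```r
# Hypothetical helper for illustration: n_cyl falls back to 4 when the caller omits it.
# mtcars ships with base R (datasets package).
avg_mpg_default <- function(car_data, n_cyl = 4) {
  mean(car_data$mpg[car_data$cyl == n_cyl])  # mean mpg of cars with n_cyl cylinders
}

avg_mpg_default(mtcars)             # n_cyl is left out, so the default (4) is used
avg_mpg_default(mtcars, n_cyl = 8)  # an explicit value overrides the default
```

A default is a reasonable choice when one value is used far more often than the others, because callers can still override it by naming the argument.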