Controlling Complexity

Explore how to control the complexity of CART decision trees by tuning hyperparameters such as minsplit and minbucket in R. Understand the impact on model size, training accuracy, and overfitting risks to effectively optimize tree-based machine learning models.

The CART hyperparameters in R

The classification and regression tree (CART) algorithm is available in R via the rpart package. In tidymodels, a CART tree is specified by passing "rpart" as the engine to the set_engine() function.
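As a minimal sketch of that specification, the snippet below builds a classification tree with parsnip (the modeling package inside tidymodels) and selects rpart as the engine. The built-in iris data and the variable names tree_spec and tree_fit are assumptions chosen only for illustration, and the code assumes the parsnip and rpart packages are installed.

```r
library(parsnip)  # modeling interface that ships with tidymodels

# Describe the model: a decision tree for classification, fit by rpart
tree_spec <- set_engine(decision_tree(mode = "classification"), "rpart")

# Fit the specification on illustrative data (iris is built into R)
tree_fit <- fit(tree_spec, Species ~ ., data = iris)

# The fitted rpart object is stored in the `fit` element of the result
class(tree_fit$fit)
```

Keeping the specification separate from the fit means the same tree_spec can be reused with different data or inside a tidymodels workflow.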

The rpart package supports many hyperparameters for controlling the complexity (i.e., tuning) of decision tree models as they are being built. Of these hyperparameters, the following are the most useful in practice:

  • minsplit: The minimum number of observations that must exist in a node for a split to be attempted

  • minbucket: The minimum number of ...
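To make the effect of these two settings concrete, here is a hedged sketch that calls rpart directly and sets them through rpart.control(). The iris data, the specific values 200 and 40, and the helper leaf_sizes() are assumptions introduced for illustration; in rpart's documentation, minbucket is the minimum number of observations allowed in any terminal node.

```r
library(rpart)

# Hypothetical helper: number of observations in each terminal node of a tree
leaf_sizes <- function(tree) tree$frame$n[tree$frame$var == "<leaf>"]

# Baseline tree with rpart's default controls
default_tree <- rpart(Species ~ ., data = iris, method = "class")

# minsplit above the 150 rows in iris: the root never qualifies for a split
root_only <- rpart(Species ~ ., data = iris, method = "class",
                   control = rpart.control(minsplit = 200))

# minbucket = 40: every terminal node must keep at least 40 observations
coarse_tree <- rpart(Species ~ ., data = iris, method = "class",
                     control = rpart.control(minbucket = 40))

# Compare tree sizes by counting terminal nodes
length(leaf_sizes(default_tree))
length(leaf_sizes(root_only))
length(leaf_sizes(coarse_tree))

# Smallest terminal node in the minbucket-constrained tree
min(leaf_sizes(coarse_tree))
```

Raising either value makes it harder for the tree to grow, which produces a smaller model; lowering them allows more splits and increases the risk of overfitting the training data.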