Controlling Complexity

Learn to control the complexity of CART decision trees via hyperparameters.

The CART hyperparameters in R

The classification and regression tree (CART) algorithm is available in R via the rpart package. In tidymodels, a CART tree is specified by passing the value "rpart" to the set_engine() function.
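As a minimal sketch of such a specification, the code below builds a classification tree with the rpart engine and fits it. It assumes the tidymodels and rpart packages are installed. The built-in iris dataset and the names tree_spec and tree_fit are chosen only for illustration and do not come from this lesson.

```r
library(tidymodels)

# Specify a CART classification tree that uses the rpart engine.
tree_spec <- decision_tree() %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Fit the specification. The iris data is an illustrative assumption.
tree_fit <- tree_spec %>%
  fit(Species ~ ., data = iris)

# The underlying rpart object is stored in the fit element.
class(tree_fit$fit)
```

No hyperparameters are set here, so the tree is built with the engine's default values.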

The rpart package supports many hyperparameters for controlling the complexity (i.e., tuning) of decision tree models as they are being built. Of these hyperparameters, the following are the most useful in practice:

  • minsplit: The minimum number of observations that must exist in a node for a split to be attempted

  • minbucket: The minimum number of ...