Controlling Complexity
Learn to control the complexity of CART decision trees via hyperparameters.
The CART hyperparameters in R
The classification and regression tree (CART) algorithms are available in R via the rpart package. CART trees can be specified in tidymodels by using the value rpart with the set_engine() function.
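As a minimal sketch of that specification, the snippet below builds a CART classification tree with the parsnip package (the modeling component of tidymodels) and selects rpart through set_engine(). The built-in iris data, the object names cart_spec and cart_fit, and the choice of classification mode are assumptions made only for illustration. The native pipe requires R 4.1 or later.

```r
# Assumes the parsnip and rpart packages are installed.
library(parsnip)

# Describe a decision tree model and select rpart as the engine.
# The mode is set explicitly because CART handles both
# classification and regression.
cart_spec <- decision_tree() |>
  set_engine("rpart") |>
  set_mode("classification")

# Fit the specification to the built-in iris data (illustrative only).
cart_fit <- fit(cart_spec, Species ~ ., data = iris)

# The fitted rpart object is stored in the `fit` element.
class(cart_fit$fit)
```

Because no hyperparameters are supplied here, the tree is grown with the engine's default settings.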
The rpart package supports many hyperparameters for controlling the complexity (i.e., tuning) of decision tree models as they are being built. Of these hyperparameters, the following are the most useful in practice:
minsplit: The minimum number of observations that must exist in a node for a split to be attempted
minbucket: The minimum number of ...