Numeric Feature Impurity
Learn how CART decision trees handle numeric features with many distinct values.
Handling numeric features
The CART classification tree algorithm efficiently handles categorical features with many levels. Now consider a numeric predictive feature. The age feature of the Adult Census Income dataset is a perfect example.
The age feature contains many values. Observations might have ages of 31, 37, 42, 58, or 63, measured in years. Note that fractional components are also possible. For example, the age feature can have a value of 31.3 years.
Not surprisingly, numeric features present the CART classification tree algorithm with many possible split points. Just like for categorical features, the algorithm designers have implemented an optimization that efficiently handles numeric features.
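To make the idea of "many possible split points" concrete, here is a minimal sketch in Python (not the R tooling used to train the tree). It assumes one common approach: sort the distinct values of the feature, treat the midpoint between each pair of neighbors as a candidate threshold, and score each candidate with the weighted Gini impurity of the two resulting groups. The text above does not spell out the exact optimization, so treat this as an illustration rather than the algorithm's actual implementation. The function names and the income labels are hypothetical; only the ages come from the text.

```python
from collections import Counter


def candidate_thresholds(values):
    """Return midpoints between consecutive distinct values (assumed strategy)."""
    distinct = sorted(set(values))
    return [(low + high) / 2 for low, high in zip(distinct, distinct[1:])]


def gini(labels):
    """Gini impurity of a list of class labels; an empty list counts as pure."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(values, labels, threshold):
    """Size-weighted Gini impurity after splitting on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Ages mentioned in the text; the labels are made up for illustration only.
ages = [31, 31.3, 37, 42, 58, 63]
labels = ["<=50K", "<=50K", "<=50K", ">50K", ">50K", ">50K"]

thresholds = candidate_thresholds(ages)
best = min(thresholds, key=lambda t: weighted_gini(ages, labels, t))

for t in thresholds:
    print(f"age <= {t}: weighted Gini = {weighted_gini(ages, labels, t):.3f}")
print("lowest-impurity threshold:", best)
```

With n distinct values there are at most n - 1 candidate thresholds, so sorting once keeps the search manageable even when the feature takes many values.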
A numeric feature example
The following sample of Adult Census Income data can be used to train a tree using R: