Many Categories SSE
Learn how CART regression trees find the optimal splits for categorical data.
Splitting binary categorical features
The simplest case for categorical features is where there are only two possible values (i.e., the feature is binary). In this case, the CART regression tree algorithm scores the feature as follows: it picks one of the two categories, sends the rows with that category to the left-hand side and all remaining rows to the right-hand side, calculates the sum of squared errors (SSE) for each side, and then adds the two SSEs.
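The calculation for a binary feature can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular CART implementation: the helper names (sse, binary_split_sse) and the toy fuel/price data are hypothetical, and each side's SSE is assumed to be measured around that side's mean target value.

```python
def sse(values):
    """Sum of squared errors of values around their mean (0.0 if empty)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def binary_split_sse(categories, targets, left_category):
    """Total SSE after splitting on a binary categorical feature.

    Rows whose category equals left_category go left; all others go right.
    """
    left = [t for c, t in zip(categories, targets) if c == left_category]
    right = [t for c, t in zip(categories, targets) if c != left_category]
    return sse(left) + sse(right)


# Hypothetical toy data: a binary feature and a numeric target.
fuel = ["gas", "gas", "diesel", "diesel", "gas"]
price = [10.0, 12.0, 20.0, 22.0, 14.0]

# gas side: mean 12, SSE 8; diesel side: mean 21, SSE 2; total 10.
print(binary_split_sse(fuel, price, "gas"))
```

Because the feature has only two categories, it does not matter which one is chosen for the left-hand side: choosing "diesel" instead of "gas" swaps the two sides and gives the same total.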
Splitting many-category features
When a categorical feature has three or more categories (i.e., levels), the CART regression tree ...