For classification tasks, there are two measures that can be used to select the best split.
These are Gini impurity and entropy.
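As a rough illustration of how these two measures behave, the following minimal Python sketch computes each one from a list of class labels, along with the size-weighted impurity of a candidate split. The function names (class_proportions, gini, entropy, split_impurity) are hypothetical helpers written for this example and are not part of MLlib; the sketch assumes non-empty label lists and a base-2 logarithm for entropy.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class present."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits (base-2 logarithm is an assumption of this sketch)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def split_impurity(left, right, measure):
    """Impurity of a candidate split, weighting each side by its size."""
    n = len(left) + len(right)
    return (len(left) / n) * measure(left) + (len(right) / n) * measure(right)


# An evenly mixed node is maximally impure for two classes.
mixed = [0, 0, 1, 1]
mixed_gini = gini(mixed)
mixed_entropy = entropy(mixed)

# A split that separates the classes perfectly has zero impurity.
perfect_split = split_impurity([0, 0], [1, 1], gini)
```

Both measures are zero for a node containing a single class and largest when the classes are evenly mixed, so a split that lowers the weighted impurity of the child nodes is preferred.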
Note
See the MLlib - Decision Tree section in the Spark Programming Guide at
http://spark.apache.org/docs/latest/mllib-decision-tree.html for further details on the
decision tree algorithm and impurity measures for classification.
In the following screenshot, we have plotted the decision boundary for the decision tree
model, as we did for the other models earlier. We can see that the decision tree is able to
fit complex, nonlinear decision boundaries.
Decision function for a decision tree for binary classification
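To make the point about nonlinear boundaries concrete, here is a small Python sketch, independent of Spark, using an XOR-style dataset that no single straight line can separate. The tree below is written by hand rather than learned, and its thresholds of 0.5 are assumptions chosen for this toy data; it only illustrates how two levels of axis-aligned splits can carve out a nonlinear decision region.

```python
def tree_predict(x1, x2):
    """Hand-written depth-2 decision tree (illustrative thresholds at 0.5)."""
    if x1 <= 0.5:
        # Left branch: the label depends on the second feature.
        return 1 if x2 > 0.5 else 0
    # Right branch: the same test on x2, with the labels flipped.
    return 0 if x2 > 0.5 else 1


# XOR-style toy data: the label is 1 when exactly one feature is "on".
points = [
    ((0.0, 0.0), 0),
    ((0.0, 1.0), 1),
    ((1.0, 0.0), 1),
    ((1.0, 1.0), 0),
]

# Fraction of points the hand-written tree classifies correctly.
tree_accuracy = sum(tree_predict(*x) == y for x, y in points) / len(points)
```

Each split tests one feature against a threshold, so the resulting regions are rectangles; combining enough of them lets a tree approximate boundaries that a linear model cannot represent.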