Decision trees for regression
Just as using linear models for regression involves changing the loss function, using decision trees for regression involves changing the node impurity measure. The impurity metric used is the variance, which is defined in the same way as the squared loss for least squares linear regression.
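To make this concrete, the following is a minimal Scala sketch of how the variance impurity of a tree node could be computed from the labels of the training points that fall into that node; the varianceImpurity function is purely illustrative and not part of MLlib.

// Variance impurity of a node, given the labels of its training points.
def varianceImpurity(labels: Seq[Double]): Double = {
  val mean = labels.sum / labels.size
  // Mean squared deviation from the node mean -- the regression analogue
  // of the squared loss used in least squares linear regression.
  labels.map(y => math.pow(y - mean, 2)).sum / labels.size
}

// Example: a node containing labels 1.0, 2.0, and 6.0.
val impurity = varianceImpurity(Seq(1.0, 2.0, 6.0))
// mean = 3.0, impurity = (4 + 1 + 9) / 3 ≈ 4.67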
Note
See the MLlib - Decision Tree section in the Spark documentation at http://spark.apache.org/docs/latest/mllib-decision-tree.html for further details on the decision tree algorithm and impurity measure for regression.
Now, we will plot a simple example of a regression problem with only one input variable shown on the x axis and the target variable on the y axis. The linear model prediction function is shown by a red dashed line, while the decision tree prediction function is shown by a green dashed line. We can see that the decision tree allows a more complex, nonlinear model to be fitted to the data.
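As a rough sketch of how such a regression tree might be trained with MLlib's DecisionTree.trainRegressor, assume an existing SparkContext named sc; the tiny one-feature dataset and the maxDepth and maxBins values below are illustrative only.

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.DecisionTree

// Toy dataset with a single feature (the x axis in the plot) and a
// numeric target (the y axis).
val data = sc.parallelize(Seq(
  LabeledPoint(1.2, Vectors.dense(0.1)),
  LabeledPoint(1.9, Vectors.dense(0.4)),
  LabeledPoint(3.7, Vectors.dense(0.9))
))

// Train a regression tree using the variance impurity; there are no
// categorical features, so the map is empty.
val categoricalFeaturesInfo = Map[Int, Int]()
val model = DecisionTree.trainRegressor(
  data, categoricalFeaturesInfo, "variance", 5, 32)

// Predict the target for a new point on the x axis.
val prediction = model.predict(Vectors.dense(0.5))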