Figure: Characterization of data used for regression.
known numerical values, such as house value. In this house pricing
example, the predictors may include attributes for house square
footage, number of bedrooms, number of bathrooms, land area, and
proximity to schools.
Also like classification, regression produces a functional
relationship between the predictor attributes and the target attribute,
Y = f(X1, X2, ..., Xm). When getting a prediction from a regression
model, some models return only the numerical prediction, for
example, a specific predicted house value such as $976,338. Others
can also return a confidence band surrounding this value. For
example, the model may report a confidence of $15,478, which means
the actual house price most likely falls within $15,478 of the
prediction, that is, between $960,860 and $991,816.
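The arithmetic behind the confidence band can be sketched in a few lines of Python. This is only an illustration, not part of the JDM API: the function name confidence_band is hypothetical, and the band is assumed to be symmetric around the prediction, which is what the figures in the example imply.

```python
def confidence_band(prediction, half_width):
    """Return the (lower, upper) bounds of a band around a prediction.

    Assumption: the band is symmetric, so the reported confidence value
    is subtracted from and added to the predicted value.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    return prediction - half_width, prediction + half_width


# Values taken from the house pricing example in the text.
lower, upper = confidence_band(976_338, 15_478)
print(f"${lower:,} to ${upper:,}")  # prints "$960,860 to $991,816"
```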
The quality of a regression model is determined by comparing the
size of the difference between the actual target value and the
predicted value. Since predictions are continuous, it is highly unlikely
that the model will predict a target value exactly, unlike classification
models, which have only a few discrete values to choose from. Instead,
metrics assess the overall error the model makes when predicting a set
of values. Chapter 7 explores specific metrics used to assess regression
models.
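As an illustration of assessing overall error across a set of predictions, the sketch below computes two widely used measures, mean absolute error and root mean squared error. The choice of these two measures is an assumption made here for illustration, the function names are hypothetical, and the prices are made-up values rather than results from any real model.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; penalizes large misses more."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up actual and predicted house values, for illustration only.
actual_prices = [300_000, 450_000, 976_338]
predicted_prices = [310_000, 440_000, 970_338]

mae = mean_absolute_error(actual_prices, predicted_prices)
rmse = root_mean_squared_error(actual_prices, predicted_prices)
print(f"MAE:  {mae:,.2f}")
print(f"RMSE: {rmse:,.2f}")
```

Neither measure requires any single prediction to be exactly right; both summarize how far off the model is on average, which suits continuous targets.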
Algorithms that can support regression in JDM include support
vector machines, neural networks, and decision trees. Other popular
regression algorithms are linear regression and generalized linear models
(GLM) [StatSci-GLM 2006].
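To make the idea of a learned functional relationship Y = f(X) concrete, here is a minimal sketch of linear regression with a single predictor, fitted by ordinary least squares. It is not JDM code and does not reflect how any particular JDM implementation works. The function names are hypothetical and the training data are made-up values.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted relationship to a new predictor value."""
    return intercept + slope * x


# Made-up training data: square footage (predictor) and house value (target).
square_footage = [1000, 1500, 2000, 2500]
house_value = [200_000, 290_000, 410_000, 500_000]

intercept, slope = fit_simple_linear_regression(square_footage, house_value)
estimate = predict(intercept, slope, 1800)
print(f"Estimated value for 1,800 sq ft: ${estimate:,.0f}")
```

With more predictors, such as number of bedrooms or land area, the same idea extends to multiple linear regression, where each predictor receives its own coefficient.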