Chapter 5. Building a Classification Model with Spark
In this chapter, you will learn the basics of classification models and how they can be used
in a variety of contexts. Classification generally refers to sorting things into distinct
categories or classes. In the case of a classification model, we typically wish to assign
classes based on a set of features. The features might represent variables related to an item
or object, an event or context, or some combination of these.
The simplest form of classification is when we have two classes; this is referred to as binary
classification. One of the classes is usually labeled as the positive class (assigned a label
of 1), while the other is labeled as the negative class (assigned a label of -1 or, sometimes,
0).
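Because the negative class may be encoded as either -1 or 0, it can help to convert between the two conventions explicitly. The following is a minimal sketch in plain Python; the function names to_signed and to_binary are hypothetical helpers chosen for illustration and are not part of Spark.

```python
def to_signed(label):
    """Map a label from the {0, 1} convention to the {-1, +1} convention."""
    if label not in (0, 1):
        raise ValueError("expected a label of 0 or 1, got %r" % (label,))
    return 2 * label - 1


def to_binary(label):
    """Map a label from the {-1, +1} convention to the {0, 1} convention."""
    if label not in (-1, 1):
        raise ValueError("expected a label of -1 or 1, got %r" % (label,))
    return (label + 1) // 2


# The positive class is 1 in both conventions; only the negative class differs.
signed_labels = [to_signed(l) for l in [1, 0, 0, 1]]
binary_labels = [to_binary(l) for l in signed_labels]
```

Which convention is appropriate depends on the algorithm and library in use, so it is worth checking what a given model expects before training.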
A simple example with two classes is shown in the following figure. The input features in
this case have two dimensions, and the feature values are represented on the x and y axes in
the figure.
Our task is to train a model that can classify new data points in this two-dimensional space
as either one class (red) or the other (blue).
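To make this task concrete, here is a minimal sketch of one of the simplest linear classifiers, the perceptron, written in plain Python rather than with Spark. The data points are made up for illustration, the names train_perceptron and predict are hypothetical, and the choice of +1 for one class and -1 for the other is arbitrary.

```python
def train_perceptron(points, labels, epochs=20, learning_rate=1.0):
    """Learn weights w and bias b so that sign(w . p + b) matches each label.

    points: list of (x, y) tuples; labels: list of +1 / -1 values.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x, y), label in zip(points, labels):
            activation = w[0] * x + w[1] * y + b
            # Update only when the point is misclassified (or on the boundary).
            if label * activation <= 0:
                w[0] += learning_rate * label * x
                w[1] += learning_rate * label * y
                b += learning_rate * label
    return w, b


def predict(w, b, point):
    """Classify a two-dimensional point as +1 or -1."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else -1


# Illustrative, linearly separable data: three points per class.
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0),
          (-1.0, -1.0), (-2.0, -1.5), (-1.5, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(points, labels)
new_point_class = predict(w, b, (3.0, 3.0))
```

The learned weights define a straight line in the two-dimensional feature space; new points are assigned to a class according to which side of that line they fall on. This simple scheme only converges when the two classes can be separated by a line, as they can in this toy data.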