Building a Classification Model with Spark - Machine Learning with Spark

Chapter 5. Building a Classification Model with Spark

In this chapter, you will learn the basics of classification models and how they can be used in a variety of contexts. Classification refers to assigning things to distinct categories, or classes. In a classification model, we typically wish to assign classes based on a set of features. The features might represent variables related to an item or object, an event or context, or some combination of these.

The simplest form of classification is when we have two classes; this is referred to as binary classification. One of the classes is usually labeled as the positive class (assigned a label of 1), while the other is labeled as the negative class (assigned a label of -1 or, sometimes, 0).
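The two label conventions for the negative class can be converted into one another with simple arithmetic. The following is a minimal sketch in plain Python; the function names `to_zero_one` and `to_plus_minus_one` are hypothetical helpers introduced only for illustration and are not part of any Spark API.

```python
def to_zero_one(label):
    """Map a label from the {-1, 1} convention to the {0, 1} convention."""
    if label not in (-1, 1):
        raise ValueError("expected a label of -1 or 1")
    return (label + 1) // 2


def to_plus_minus_one(label):
    """Map a label from the {0, 1} convention to the {-1, 1} convention."""
    if label not in (0, 1):
        raise ValueError("expected a label of 0 or 1")
    return 2 * label - 1


# The positive class keeps the label 1 in both conventions;
# only the negative class changes between -1 and 0.
converted = [to_zero_one(label) for label in (1, -1)]
```

Which convention is appropriate depends on the model and library in use, so it is worth checking what a given implementation expects before training.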

A simple example with two classes is shown in the following figure. The input features in this case have two dimensions, and the feature values are represented on the x and y axes in the figure.

Our task is to train a model that can classify new data points in this two-dimensional space as either one class (red) or the other (blue).
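To make the task concrete, here is a minimal sketch of training a binary classifier on two-dimensional points. It uses a simple perceptron written in plain Python rather than Spark, purely as an illustration. The data in `POINTS`, the function names, and the parameters `epochs` and `learning_rate` are all assumptions made for this example and do not come from the chapter.

```python
# Hypothetical toy data: (x, y) feature pairs with a label of
# 1 (positive class) or -1 (negative class).
POINTS = [
    ((2.0, 3.0), 1),
    ((3.0, 3.5), 1),
    ((2.5, 4.0), 1),
    ((-1.0, -2.0), -1),
    ((-2.0, -1.5), -1),
    ((-1.5, -3.0), -1),
]


def train_perceptron(data, epochs=20, learning_rate=1.0):
    """Learn weights for x and y plus a bias using the perceptron update rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x, y), label in data:
            score = w[0] * x + w[1] * y + b
            # Update only when the point is misclassified or lies on the boundary.
            if label * score <= 0:
                w[0] += learning_rate * label * x
                w[1] += learning_rate * label * y
                b += learning_rate * label
                mistakes += 1
        if mistakes == 0:
            # Every training point is on the correct side of the line.
            break
    return w, b


def predict(w, b, point):
    """Classify a new two-dimensional point as 1 or -1."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else -1


weights, bias = train_perceptron(POINTS)
new_point_class = predict(weights, bias, (1.0, 1.0))
```

The learned weights and bias define a straight line in the two-dimensional feature space; new points are assigned a class according to which side of that line they fall on. A perceptron only converges when the two classes can be separated by such a line, which holds for this toy data by construction.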
