Classifier - Software Development Case Studies in Java

Figure 4.1 Decision tree (decision nodes AC and ABS, each with branches "yes" and "no"; leaves High, Medium and Low)

4.1.1

Classification criteria

As an example, we consider two characteristics of cars: AC (air conditioning) and ABS (anti-lock braking system). Each characteristic can take one of two values, “yes” or “no”, indicating whether the option is present.

Based on these two characteristics we classify cars into three price categories: “high”, “medium” and “low”. A car belongs to the “high” category if it has both AC and ABS, i.e. both characteristics have the value “yes”. If both characteristics have the value “no”, the car belongs to the “low” category. In all other cases the car belongs to the “medium” category. These rules can be expressed as several equivalent classification procedures, such as the one shown in Figure 4.1. It is a decision tree: starting from the root, we follow a path determined by the values of the characteristics until we reach a leaf, which represents the category of the car.
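The decision tree described above can be sketched as nested conditionals. This is only an illustration: the class, enum and method names are assumptions, and it assumes AC is tested at the root before ABS.

```java
// Illustrative sketch; CarPriceTree, Category and classify are hypothetical names.
class CarPriceTree {
    enum Category { HIGH, MEDIUM, LOW }

    // Follows one path through the tree: AC at the root, then ABS on the chosen branch.
    static Category classify(boolean ac, boolean abs) {
        if (ac) {
            // AC = yes: both options present gives HIGH, otherwise MEDIUM.
            return abs ? Category.HIGH : Category.MEDIUM;
        } else {
            // AC = no: ABS alone gives MEDIUM, neither option gives LOW.
            return abs ? Category.MEDIUM : Category.LOW;
        }
    }

    public static void main(String[] args) {
        boolean[] values = { true, false };
        for (boolean ac : values) {
            for (boolean abs : values) {
                System.out.println("AC=" + ac + ", ABS=" + abs + " -> " + classify(ac, abs));
            }
        }
    }
}
```

Each call to classify corresponds to one root-to-leaf path, so the four possible inputs cover every leaf of the tree.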

4.2

Problem analysis

In our application we have an initial set of existing car models that have

been assigned to an existing category by the experts of the insurance

company. We want to build a classifier that captures the criteria used so far

and applies them to new car models. Our problem consists of two separate

issues. First of all we need a tool to automatically analyse and capture the

criteria applied by human experts to assign cars to their risk category in the

past. Second, we need a tool that can evaluate the risk category of new car

models.

Typically there are two kinds of classifier: expert-based and example-based. An expert-based classifier applies a set of rules, defined by an expert, that tell how to assign a category to an item. An example-based classifier infers that set of rules from a set of example items that have previously been labelled with a category tag.
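To make the expert-based approach concrete, the sketch below hard-codes the AC/ABS price rules from Section 4.1.1 as an ordered list and returns the category of the first rule that matches. The Rule class, the method names and the use of BiPredicate are assumptions made for illustration, not the design adopted in the case study.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Illustrative sketch; all names here are hypothetical.
class ExpertRuleSketch {
    // One expert-defined rule: a condition on (AC, ABS) and the category it assigns.
    static final class Rule {
        final BiPredicate<Boolean, Boolean> condition;
        final String category;

        Rule(BiPredicate<Boolean, Boolean> condition, String category) {
            this.condition = condition;
            this.category = category;
        }
    }

    // The rules are written by hand, as an expert would supply them.
    static List<Rule> expertRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule((ac, abs) -> ac && abs, "high"));
        rules.add(new Rule((ac, abs) -> !ac && !abs, "low"));
        rules.add(new Rule((ac, abs) -> true, "medium")); // fallback for all other cases
        return rules;
    }

    // Returns the category of the first rule whose condition holds.
    static String classify(List<Rule> rules, boolean ac, boolean abs) {
        for (Rule rule : rules) {
            if (rule.condition.test(ac, abs)) {
                return rule.category;
            }
        }
        throw new IllegalStateException("no rule matched");
    }

    public static void main(String[] args) {
        System.out.println(classify(expertRules(), true, false));
    }
}
```

An example-based classifier would differ only in where the rule list comes from: it would be derived from labelled items instead of being typed in.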

Since insurance companies have a large set of examples to start from, the obvious choice is to adopt an example-based classifier. The classifier must capture the correlation between the characteristics of the cars (e.g. ABS, AC, engine power) and the categories assigned to them.
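As a rough illustration of the example-based idea, the sketch below memorises the category that labelled examples give to each combination of AC and ABS, then looks new cars up in that table. It is a deliberately naive stand-in, not a real rule-induction algorithm: it cannot generalise to unseen combinations, and when examples conflict the last one wins. All names and the "unknown" fallback are assumptions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch; all names here are hypothetical.
class ExampleBasedSketch {
    // A car model already labelled with a category by an expert.
    static final class Example {
        final boolean ac;
        final boolean abs;
        final String category;

        Example(boolean ac, boolean abs, String category) {
            this.ac = ac;
            this.abs = abs;
            this.category = category;
        }
    }

    // "Training": record the category seen for each (AC, ABS) combination.
    // If two examples conflict, the later one overwrites the earlier one.
    static Map<List<Boolean>, String> learn(List<Example> examples) {
        Map<List<Boolean>, String> table = new HashMap<>();
        for (Example e : examples) {
            table.put(Arrays.asList(e.ac, e.abs), e.category);
        }
        return table;
    }

    // Classification: look the new car up; combinations never seen yield "unknown".
    static String classify(Map<List<Boolean>, String> table, boolean ac, boolean abs) {
        return table.getOrDefault(Arrays.asList(ac, abs), "unknown");
    }

    public static void main(String[] args) {
        List<Example> examples = Arrays.asList(
                new Example(true, true, "high"),
                new Example(false, false, "low"));
        Map<List<Boolean>, String> table = learn(examples);
        System.out.println(classify(table, true, true));
        System.out.println(classify(table, true, false));
    }
}
```

The split between learn and classify mirrors the two tools identified in the problem analysis: one captures the criteria from past assignments, the other applies them to new car models.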
