Figure 4.1 Decision tree
As an example, we consider two characteristics of cars: AC (air condition-
ing) and ABS (automatic braking system). Both characteristics can assume
one of the two values “yes” and “no”, indicating whether the option is
Based on these two characteristics we classify the cars in three price
categories: “high”, “medium”, “low”. A car belongs to the “high” category if
it has both AC and ABS, i.e. both characteristics have the value “yes”. If both
characteristics have the value “no” then the car belongs to the “low” cate-
gory. In the other cases the car belongs to category “medium”. From the
above criteria it is possible to devise several classification criteria such as
the one shown in Figure 4.1. It is a decision tree: it is possible to follow a
path, based on the values of the characteristics, reaching a leaf that
represents the category of the car.
In our application we have an initial set of existing car models that have
been assigned to an existing category by the experts of the insurance
company. We want to build a classifier that captures the criteria used so far
and applies them to new car models. Our problem consists of two separate
issues. First of all we need a tool to automatically analyse and capture the
criteria applied by human experts to assign cars to their risk category in the
past. Second, we need a tool that can evaluate the risk category of new car
Typically there are two kinds of classifier: expert-based and example-
based. An expert-based classifier elaborates a set of rules defined by an
expert and tells how to assign a category to an item. An example-based
classifier identifies that set of rules by looking at a set of example items
which have been previously labelled with a category tag.
Since insurance companies have a large set of examples we can start from,
the obvious choice is to adopt an example-based classifier. The classifier
must capture the correlation between the characteristics of the cars (e.g.
ABS, AC, engine power).