Figure 12.13 Learning a model for the normal class.
Figure 12.14 Detecting outliers by semi-supervised learning. (Figure labels: clusters C and C1, object a. Legend: objects with label “normal,” objects with label “outlier,” objects without label.)
class is regarded as normal. To detect outlier cases, AllElectronics can learn a model for
each normal class. To determine whether a case is an outlier, we can run each model on
the case. If the case does not fit any of the models, then it is declared an outlier.
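The "one model per normal class" idea can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which kind of model is learned, so the sketch uses a hypothetical centroid-plus-radius model, and the class names, sample points, and `slack` parameter are invented for the example.

```python
import math

def fit_one_class(points, slack=1.5):
    """Hypothetical one-class model: centroid plus a distance threshold.

    The threshold is the largest training distance scaled by `slack`
    (an assumed parameter, not taken from the text).
    """
    dim = len(points[0])
    centroid = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(p, centroid) for p in points) * slack
    return centroid, radius

def fits(model, case):
    """A case fits a model if it lies within the model's radius."""
    centroid, radius = model
    return math.dist(case, centroid) <= radius

def is_outlier(models, case):
    """Declare an outlier only if the case fits none of the normal-class models."""
    return not any(fits(m, case) for m in models.values())

# Illustrative training data for two normal classes (made-up values).
normal_classes = {
    "small_purchase": [(10.0, 1.0), (12.0, 1.0), (11.0, 2.0)],
    "bulk_order": [(500.0, 40.0), (520.0, 42.0), (510.0, 38.0)],
}
models = {name: fit_one_class(pts) for name, pts in normal_classes.items()}
```

Calling `is_outlier(models, case)` runs every model on the case, mirroring the procedure described above. Any one-class learner could stand in for `fit_one_class`.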
Classification-based methods and clustering-based methods can be combined to
detect outliers in a semi-supervised manner.
Example 12.20 Outlier detection by semi-supervised learning. Consider Figure 12.14, where objects
are labeled as either “normal” or “outlier,” or have no label at all. Using a clustering-based
approach, we find a large cluster, C, and a small cluster, C1. Because some objects
in C carry the label “normal,” we can treat all objects in this cluster (including those
without labels) as normal objects. We use the one-class model of this cluster to identify
normal objects in outlier detection. Similarly, because some objects in cluster C1 carry
the label “outlier,” we declare all objects in C1 as outliers. Any object that does not fall
into the model for C (e.g., a) is considered an outlier as well.
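The steps of this example can be sketched in code. This is a minimal illustration, not the book's algorithm: the clustering step is a simple distance-threshold grouping chosen for brevity, the one-class model is a hypothetical centroid-plus-radius model, and the coordinates, `eps`, and `slack` are all assumed values. Clusters with mixed or no labels are deferred to the model check, which is also an assumption.

```python
import math

def cluster(points, eps):
    """Group point indices into connected components linked by distance <= eps."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            component.extend(near)
            frontier.extend(near)
        clusters.append(component)
    return clusters

def detect(points, labels, eps=1.5, slack=1.5):
    """Propagate labels within clusters, then test the rest against normal models."""
    verdict = [None] * len(points)
    normal_models = []
    for members in cluster(points, eps):
        seen = {labels[i] for i in members if labels[i] is not None}
        if seen == {"normal"}:
            # Some members are labeled normal: treat the whole cluster as normal
            # and build a one-class model (centroid, radius) from it.
            for i in members:
                verdict[i] = "normal"
            pts = [points[i] for i in members]
            centroid = tuple(sum(c) / len(pts) for c in zip(*pts))
            radius = max(math.dist(p, centroid) for p in pts) * slack
            normal_models.append((centroid, radius))
        elif seen == {"outlier"}:
            # Some members are labeled outlier: declare the whole cluster outliers.
            for i in members:
                verdict[i] = "outlier"
    for i, v in enumerate(verdict):
        if v is None:
            # Undecided objects are outliers unless they fall into a normal model.
            inside = any(math.dist(points[i], c) <= r for c, r in normal_models)
            verdict[i] = "normal" if inside else "outlier"
    return verdict

# Made-up data: five objects in C, two in C1, and the isolated object a.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (10, 10), (10, 11), (5, 6)]
labels = ["normal", None, None, "normal", None, "outlier", None, None]
verdict = detect(points, labels)
```

With this data, the unlabeled members of C inherit "normal," the unlabeled member of C1 inherits "outlier," and the last object (playing the role of a) is flagged because it lies outside the model built from C.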