age  sex     marital_status  number_of_children  purchased
33   female  single          0                   0
34   female  single          1                   1
35   female  married         2                   1
36   female  married         0                   1
29   female  single          0                   0
30   male    single          0                   0
31   male    single          1                   1
32   male    single          2                   1
33   male    married         0                   1

FAMSTAND (marital status), 18 DS
    = ledig (single): KINDER (number of children), 13 DS
        <= 0: [ 1], 11 DS
        > 0:  [ 2], 2 DS
    = verheir (married): [ 2], 5 DS
Fig. 5. Example Data Base and Resulting Decision Tree for Campaign Management
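The tree in Fig. 5 can be read as a pair of nested tests. The following minimal sketch encodes those tests and checks them against the rows listed in the figure. The function names are hypothetical, and the mapping of leaf [1] to purchased = 0 and leaf [2] to purchased = 1 is an assumption inferred from the listed rows, not something the figure states.

```python
# Rows as listed in the figure:
# (age, sex, marital_status, number_of_children, purchased)
ROWS = [
    (33, "female", "single", 0, 0),
    (34, "female", "single", 1, 1),
    (35, "female", "married", 2, 1),
    (36, "female", "married", 0, 1),
    (29, "female", "single", 0, 0),
    (30, "male", "single", 0, 0),
    (31, "male", "single", 1, 1),
    (32, "male", "single", 2, 1),
    (33, "male", "married", 0, 1),
]

# Assumption: leaf [1] means "did not purchase", leaf [2] means "purchased".
LEAF_TO_PURCHASED = {1: 0, 2: 1}


def tree_leaf(marital_status, number_of_children):
    """Follow the tree of Fig. 5 and return the leaf label (1 or 2)."""
    # Root node FAMSTAND: branch "=verheir" (married) is a leaf.
    if marital_status == "married":
        return 2
    # Branch "=ledig" (single) leads to node KINDER.
    return 1 if number_of_children <= 0 else 2


def agreement(rows):
    """Fraction of rows whose 'purchased' value matches the tree's leaf."""
    hits = sum(
        LEAF_TO_PURCHASED[tree_leaf(status, children)] == purchased
        for _age, _sex, status, children, purchased in rows
    )
    return hits / len(rows)


if __name__ == "__main__":
    print(agreement(ROWS))
```

Under the assumed leaf mapping, every listed row agrees with the tree. Age and sex are not used, which matches the figure: neither attribute appears as a node.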
4.3 Knowledge Discovery
4.3.1 Deviation Detection
Real-world observations are random events. Determining characteristic values, such as
the quality of an industrial part or the influence of a medical treatment on a patient
group, or detecting regions of visual attention in images, can be based on statistical
parameter tests.
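As one possible instance of a statistical parameter test, the sketch below runs a two-sided z-test on the mean of a sample. The text does not prescribe a particular test, so the choice of the z-test, the function names, the significance level and the assumption of a known standard deviation are all illustrative.

```python
import math


def z_test(sample, mu0, sigma):
    """Two-sided z-test of the sample mean against the nominal value mu0.

    Assumes the observations are independent and normally distributed
    with a known standard deviation sigma.
    Returns the z statistic and the two-sided p-value.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma / math.sqrt(n))
    # For a standard normal Z: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def deviates(sample, mu0, sigma, alpha=0.05):
    """True if the sample mean deviates significantly from mu0."""
    _z, p_value = z_test(sample, mu0, sigma)
    return p_value < alpha


if __name__ == "__main__":
    # Hypothetical part diameters in mm, nominal value 10.0 mm.
    diameters = [10.3, 10.4, 10.2, 10.5]
    print(deviates(diameters, mu0=10.0, sigma=0.1))
```

A sample whose mean equals the nominal value yields z = 0 and a p-value of 1, so it is never flagged as a deviation.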
4.3.2 Cluster Analysis
A number of objects, each represented by an n-dimensional attribute vector, should
be partitioned into meaningful groups. Objects within one group should be as similar
as possible, and objects from different groups should be as dissimilar as possible.
The basis for this operation is a concept of similarity that allows us to measure the
closeness of two data entries and to express the degree of their closeness.
Once groups have been found, we can assign a class label to each group and label
each data entry in our data base with the class label of the group it belongs to.
The result is a data base that can serve as a basis for
classification.
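The grouping and labeling steps can be sketched with a small agglomerative procedure. This is a minimal illustration that assumes Euclidean distance as the closeness measure and single linkage as the merge rule. Neither is prescribed by the text, and the function names and sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Distance between two attribute vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, k):
    """Merge the two closest groups until only k groups remain.

    Returns one class label per point, in the order of `points`.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Start with every object in its own group (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index of group a, index of group b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance of the closest pair of members.
                d = min(
                    euclidean(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _d, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    # Assign a class label to each group and label every data entry.
    labels = [None] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(single_linkage(data, k=2))
```

The returned labels can be stored next to each data entry, which gives the labeled data base described above.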
4.3.3 Visualization
The famous remark "A picture is worth more than a thousand words" holds especially
for the exploration of large data sets. Long lists of numbers are hard for humans to
survey. Summarizing the data in a suitable graphical representation may give humans
better insight into the data. For example, clusters are usually represented
numerically. A dendrogram illustrates these groupings and gives a human an
understanding of the relations between the various groups and subgroups.
A large set of rules is easier to understand when it is structured hierarchically
and viewed graphically, for example in the form of a decision tree.
4.3.4 Association Rules
Finding associations between different types of information that seem to have
no semantic dependence can give useful insights into, for example, customer behavior.
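The strength of an association is commonly quantified by its support and confidence. The text does not name these measures, so the sketch below is an illustration under that assumption. The function names and the shopping baskets are invented.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        raise ValueError("transactions must not be empty")
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies in this data
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


if __name__ == "__main__":
    # Hypothetical shopping baskets.
    baskets = [
        {"bread", "butter"},
        {"bread", "butter", "milk"},
        {"bread"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "butter"}))
    print(confidence(baskets, {"bread"}, {"butter"}))
```

A rule such as bread -> butter is usually reported only if both values exceed thresholds chosen by the analyst.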