coupling, however, refer to the coupling of data, not to the coupling of databases. 31
Matching and verification are not closely related to enhancing data mining and
are, therefore, beyond the scope of this chapter. For more on combining database
and identity resolution issues, see Chapter 10.
[Figure 2.6: four panels, labelled A, B, C and D]
Fig. 2.6 Different forms of database coupling. The dotted parts are filled with data and the
blank parts are empty. A: The coupling of records; B: The coupling of attributes per record;
C: The combination of coupling records and attributes; full integration; D: The combination
of coupling records and attributes; partial integration. The horizontal length represents the
number of records and the vertical length represents the number of attributes.
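The four forms of coupling in the caption can be sketched on toy data. This is only an illustrative sketch: the databases, the record keys (`r1`, `r2`, `r3`), the attribute names and the helper functions `couple` and `is_fully_integrated` are all hypothetical and do not come from the chapter. Empty cells (the blank parts of the figure) are modelled here as `None`.

```python
# Hypothetical toy databases: record key -> attributes of that record.
db_ages = {"r1": {"age": 34}, "r2": {"age": 51}}
db_more_ages = {"r3": {"age": 27}}                        # other records, same attribute
db_cities = {"r1": {"city": "Delft"}, "r2": {"city": "Gent"}}  # same records, other attribute
db_all_cities = {**db_cities, "r3": {"city": "Lyon"}}     # covers every record
db_some_cities = {"r2": {"city": "Gent"}, "r3": {"city": "Lyon"}}  # partial overlap


def couple(left, right):
    """Couple two databases on their record keys.

    Every record and every attribute from either source is kept; a cell
    that neither source can fill is set to None.
    """
    attributes = sorted(
        {name for db in (left, right) for record in db.values() for name in record}
    )
    coupled = {}
    for key in sorted(left.keys() | right.keys()):
        merged = {**left.get(key, {}), **right.get(key, {})}
        coupled[key] = {name: merged.get(name) for name in attributes}
    return coupled


def is_fully_integrated(db):
    """True when no cell of the coupled database is empty."""
    return all(value is not None for record in db.values() for value in record.values())


case_a = couple(db_ages, db_more_ages)   # A: coupling of records
case_b = couple(db_ages, db_cities)      # B: coupling of attributes per record
case_c = couple(case_a, db_all_cities)   # C: records and attributes, full integration
case_d = couple(db_ages, db_some_cities) # D: records and attributes, partial integration
```

In this sketch, cases A to C contain no empty cells, whereas case D leaves `r1` without a city and `r3` without an age, which is one way to read the partially filled panel D.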
2.6 Conclusion
In this chapter we introduced data mining as a technique for building models from
huge amounts of data. The need for data mining is motivated by the challenges posed
by the sheer volume of data available nowadays. Data mining offers many
different tools for the automatic analysis of data. In this chapter we discussed two
unsupervised techniques: pattern mining, which finds local patterns, each describing a
particular trend or regularity in the data, and clustering, which aims at building a
global model of the data by dividing the dataset into clusters of homogeneous data
records. The third technique, classification, is a supervised technique, as it
requires the availability of records extended with a class attribute that holds the
label of the group to which the record belongs.
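The difference between the three tasks can be made concrete with a deliberately small sketch. The functions below are simple stand-ins chosen for brevity (pair counting for pattern mining, one-dimensional 2-means for clustering, 1-nearest-neighbour for classification); they are not the algorithms presented in the chapter, and all names and data are hypothetical. Note that only the classifier receives labelled records.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Pattern mining (unsupervised): item pairs found in at least min_support transactions."""
    counts = Counter(
        pair
        for items in transactions
        for pair in combinations(sorted(set(items)), 2)
    )
    return {pair: n for pair, n in counts.items() if n >= min_support}


def two_means(values, iterations=10):
    """Clustering (unsupervised): split 1-D values into two homogeneous groups."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups


def nearest_neighbour_label(labelled, value):
    """Classification (supervised): copy the class label of the closest labelled record."""
    return min(labelled, key=lambda record: abs(record[0] - value))[1]


baskets = [["bread", "milk"], ["bread", "milk", "eggs"], ["milk", "eggs"]]
patterns = frequent_pairs(baskets, min_support=2)
clusters = two_means([1.0, 1.2, 0.8, 9.0, 9.5])
label = nearest_neighbour_label([(1.0, "low"), (9.0, "high")], 8.1)
```

Here `patterns` describes local regularities (pairs of items bought together), `clusters` partitions the whole dataset, and `label` is a prediction that is only possible because the training records carry a class attribute.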
In the data mining community many algorithms have been developed for these three
main tasks, providing governments and companies with new tools to build better
profiles and make more accurate predictions about the future by extrapolating from
information extracted from the past.
As will be discussed in the next chapters of this book, however, these new data
mining techniques also harbour some dangers. When collecting data for data
mining, data from many different databases often needs to be coupled and
combined, which frequently leads to privacy problems for the individuals whose personal
31 Although verification is not really a method of database coupling, it may enhance the
results of data mining, as was mentioned above.