$\preceq \; \subseteq K \times K$ is a transitive relation, called a generalization relation; $\mathcal{G} \neq \emptyset$ is the set of partial functions $G : K \longrightarrow K$. Elements of $\mathcal{G}$ are called generalization operators.
We define all components of the model in the following Sects. 2.1-2.4.
2.1 Knowledge Generalization System
The knowledge generalization system is an extension of the notion of an information system. The information system was introduced in [9] as a database model. The information system represents the relational table with the key attribute acting as the object attribute and is defined as follows.
Definition 2. Pawlak's Information System is a system $I = (U, A, V_A, f)$, where $U \neq \emptyset$ is called a set of objects, $A \neq \emptyset$ and $V_A \neq \emptyset$ are called the sets of attributes and values of attributes, respectively, and $f$ is called an information function, $f : U \times A \longrightarrow V_A$.
Any Data Mining process starts with a certain initial set of data. The model of such a process depends on the representation of this data, i.e. it starts with an initial information system $I_0 = (U_0, A_0, V_{A_0}, f_0)$, and we adopt the set $U_0$ as the universe of the model, i.e. $GM = (U_0, K, \mathcal{G}, \preceq)$.
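As an illustration only, an information system in the sense of Definition 2 can be sketched as a small Python data structure in which the information function is stored as a lookup table keyed by (object, attribute) pairs. The class name `InformationSystem`, its field names, and the sample objects and values are assumptions made for this sketch and do not come from the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Hashable, Tuple


@dataclass
class InformationSystem:
    """Sketch of I = (U, A, V_A, f) with f : U x A -> V_A (names are hypothetical)."""
    objects: FrozenSet[Hashable]                   # U
    attributes: FrozenSet[str]                     # A
    values: FrozenSet[Hashable]                    # V_A
    table: Dict[Tuple[Hashable, str], Hashable]    # f, stored as a dict

    def __post_init__(self) -> None:
        # U, A and V_A are required to be non-empty.
        if not (self.objects and self.attributes and self.values):
            raise ValueError("U, A and V_A must all be non-empty")
        # f must be defined on every pair in U x A and take values in V_A.
        for x in self.objects:
            for a in self.attributes:
                if (x, a) not in self.table:
                    raise ValueError(f"f is undefined for ({x!r}, {a!r})")
                if self.table[(x, a)] not in self.values:
                    raise ValueError(f"f({x!r}, {a!r}) is not in V_A")

    def f(self, x: Hashable, a: str) -> Hashable:
        """Information function: the value of attribute a for object x."""
        return self.table[(x, a)]


# Hypothetical initial information system I_0 with three objects.
I0 = InformationSystem(
    objects=frozenset({"x1", "x2", "x3"}),
    attributes=frozenset({"age", "city"}),
    values=frozenset({23, 35, 61, "Boston", "Madrid"}),
    table={
        ("x1", "age"): 23, ("x1", "city"): "Boston",
        ("x2", "age"): 35, ("x2", "city"): "Madrid",
        ("x3", "age"): 61, ("x3", "city"): "Boston",
    },
)
```

Each row of the relational table corresponds to one object of `I0.objects`, and `I0.f(x, a)` reads one cell of that table.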
In the preprocessing stage of a data mining process we might perform the following standard operations:
1. Eliminate some attributes, apply a concept hierarchy, etc., obtaining as a result the information system $I$ with the set of attributes $A \subset A_0$.
2. Perform some operations on values of attributes: normalization, clustering, etc., obtaining some set $V_A$ of values of attributes that is similar, or equivalent, to $V_0$. We denote it by $V_A \sim V_0$.
Given an attribute value $v_a \in V_0$ and a corresponding attribute value $v'_a \in V_A$ (for example, $v'_a$ being a normalized form of $v_a$, or $v'_a$ being a more general form of $v_a$ as defined by a concept hierarchy), we denote this correspondence by $v_a \sim v'_a$.
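The two preprocessing operations can be sketched in Python as transformations of the information function, here stored as a dictionary keyed by (object, attribute) pairs. This is a minimal sketch under stated assumptions: the helper names `drop_attributes` and `map_values`, the `age_group` hierarchy, and the sample data are all hypothetical and are not taken from the text.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

# The information function as a lookup table: (object, attribute) -> value.
Table = Dict[Tuple[Hashable, str], Hashable]


def drop_attributes(table: Table, keep: Iterable[str]) -> Table:
    """Operation 1: keep only the attributes in a chosen subset A of A_0."""
    kept = set(keep)
    return {(x, a): v for (x, a), v in table.items() if a in kept}


def map_values(table: Table, attribute: str,
               mapping: Callable[[Hashable], Hashable]) -> Table:
    """Operation 2: replace every value of one attribute by its corresponding value."""
    return {
        (x, a): (mapping(v) if a == attribute else v)
        for (x, a), v in table.items()
    }


def age_group(age: int) -> str:
    """Hypothetical concept hierarchy: a numeric age generalizes to an age group."""
    if age < 30:
        return "young"
    if age < 60:
        return "adult"
    return "senior"


# Hypothetical initial information function f_0 over attributes {age, city}.
f0: Table = {
    ("x1", "age"): 23, ("x1", "city"): "Boston",
    ("x2", "age"): 35, ("x2", "city"): "Madrid",
    ("x3", "age"): 61, ("x3", "city"): "Boston",
}

# Eliminate the attribute "city", then generalize the remaining "age" values.
f1 = map_values(drop_attributes(f0, ["age"]), "age", age_group)
```

Here each original value and the value that replaces it (23 and "young", for instance) stand in the correspondence written with $\sim$ in the text, and the objects are unchanged.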
We call any information system $I$ obtained by any of the above operations a subsystem of $I_0$. We put it formally in the following definition.