instances, it provides interesting alternatives to the variety of decision trees
proposed in the literature.
The rest of this paper is organized as follows. In Sects. 2 and 3, we present
the data format and rule form of the classification problem respectively. In
particular, we propose fuzzy data tables for data representation and use fuzzy
decision logic as the rule representation language. In Sect. 4, we introduce a
uniform framework, called general fuzzy decision trees. The edges of a general
fuzzy decision tree are labeled by fuzzy decision logic formulas and the nodes
are split according to the satisfaction of these formulas in the data records (or
objects). We also present a construction algorithm for general fuzzy decision
trees. In Sect. 5, we show the application of our framework to different types
of training data by instantiating it to some specific cases. In particular, the
bipolar interpretation of general fuzzy decision trees results in ordinary fuzzy
decision trees [6] and multi-valued decision trees [2]. Finally, in Sect. 6, we
briefly conclude this paper and indicate some further research directions.
2 Data Representation
A data table is normally used as a means of storing data. A formal definition
of a data table is given in [12].
Definition 1. A data table¹ is a pair $S = (U, A)$ such that
  $U = \{x_1, x_2, \dots, x_n\}$ is a nonempty finite set, called the universe;
  $A = \{f_1, f_2, \dots, f_m\}$ is a nonempty finite set of primitive attributes;
  for $1 \le i \le m$, $f_i : U \to V_i$ is a total function, where $V_i$ is the set of
  values for $f_i$, called the domain of values of $f_i$.
To distinguish data tables from fuzzy data tables, we call them precise data
tables. Hereafter, when we mention a data table $S = (U, A)$, we assume that
the cardinalities of $U$ and $A$ are respectively $n$ and $m$, $f_i$ denotes the $i$th
attribute in $A$, and $V_i$ is its domain of values. Each element in $U$ represents a
data record. Since each data record describes the attributes of an object, we
identify a data record with the object described by the data record. Thus, the
elements of $U$ are also called objects. In the following presentation, we treat
the terms “data records” and “objects” interchangeably.
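As an illustration of Definition 1 (not part of the original formulation), a
precise data table could be sketched in Python as follows. The class name
DataTable, its field names, and the sample attributes "colour" and "size" are
hypothetical choices made for this sketch. Each attribute $f_i$ is stored as a
lookup table, and the constructor checks that it is total on $U$ and takes its
values in $V_i$.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Set


@dataclass(frozen=True)
class DataTable:
    """A precise data table S = (U, A); all names here are illustrative."""

    universe: List[Hashable]                          # U = {x_1, ..., x_n}
    attributes: Dict[str, Dict[Hashable, Hashable]]   # each f_i as a lookup table
    domains: Dict[str, Set[Hashable]]                 # V_i for each f_i

    def __post_init__(self) -> None:
        # U and A must be nonempty (finiteness is implied by the containers).
        if not self.universe or not self.attributes:
            raise ValueError("U and A must both be nonempty")
        if len(set(self.universe)) != len(self.universe):
            raise ValueError("U must not contain duplicate objects")
        for name, f in self.attributes.items():
            if name not in self.domains:
                raise ValueError(f"no domain declared for attribute {name!r}")
            # Totality: f_i must be defined on every object of U.
            missing = [x for x in self.universe if x not in f]
            if missing:
                raise ValueError(f"{name!r} is undefined on {missing}")
            # Every value f_i(x) must lie in the declared domain V_i.
            bad = [x for x in self.universe if f[x] not in self.domains[name]]
            if bad:
                raise ValueError(f"{name!r} leaves its domain on {bad}")

    def value(self, name: str, x: Hashable) -> Hashable:
        """Return f_i(x) for the attribute called `name`."""
        return self.attributes[name][x]


# A small table with n = 3 objects and m = 2 attributes.
table = DataTable(
    universe=["x1", "x2", "x3"],
    attributes={
        "colour": {"x1": "red", "x2": "green", "x3": "red"},
        "size": {"x1": 1, "x2": 3, "x3": 2},
    },
    domains={"colour": {"red", "green", "blue"}, "size": {1, 2, 3}},
)
```

Storing each attribute as a dictionary keeps the requirement that $f_i$ be a
total function explicit: a record with a missing value is rejected at
construction time, which is exactly the assumption that distinguishes a precise
data table from the incomplete variants discussed next.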
In a precise data table, it is assumed that $f_i(x)$ is exactly known for each
object $x$ and attribute $f_i$. However, in some practical situations, we have only
incomplete information about $f_i(x)$ for some $f_i$ and $x$. To accommodate such
situations, incomplete information systems have been proposed [8-10, 16, 17].
Furthermore, many practical data mining problems need to deal with multi-
valued data [2].
¹ Also called a knowledge representation system, information system, or
attribute-value system.