Information Technology Reference
In-Depth Information
incompleteness in orders are processed based on a theory of order statistics. Addition-
ally, we propose several methods for interpreting the clusters of orders.
We formalize this clustering task in Section 15.2. Our previous and new cluster-
ing methods are presented in Section 15.3. The experimental results are shown in
Sections 15.4 and 15.5. Section 15.6 summarizes our conclusions.
15.2
Clustering Orders
In this section, we formalize the task of clustering orders. We start by defining our ba-
sic notations regarding orders. An object, entity, or substance to be sorted is denoted
by x j . The universal object set, X , consists of all possible objects, and L is defined
as
X |
x b . Note that subscript j
of x doesn't mean “The j -th object in this order,” but that “The object is uniquely in-
dexed by j in X .” The order x 1
|
. The order is denoted by O = x a ···
x j ···
x 2 represents “ x 1 precedes x 2 .” An object set X ( O i )
or simply X i is composed of all objects in the order O i . The length of O i ,i.e.,
,
is shortly denoted by L i . An order of all objects, i.e., O i s . t .X ( O i )= X , is called
a complete order; otherwise, the order is incomplete. Rank, r ( O i ,x j ) or simply r ij ,
is the cardinal number that indicates the position of the object x j in the order O i .For
example, for O i = x 1
|
X i |
x 2 , r ( O i ,x 2 ) or r i 2 is 3. Two orders, O 1 and O 2 , are concor-
dant if ordinal relations are consistent between any object pairs commonly contained in
these two orders; otherwise, they are discordant. Formally, for two orders, O 1 and O 2 ,
consider an object pair x a and x b such that x a ,x b
x 3
= x b . We say that the
orders O 1 and O 2 are concordant w.r.t. x a and x b if the two objects are placed in the
same order, i.e., ( r 1 a
X 1
X 2 ,x a
0; otherwise, they are discordant. Further,
O 1 and O 2 are concordant if O 1 and O 2 are concordant w.r.t. all object pairs such that
x a ,x b
r 1 b )( r 2 a
r 2 b )
= x b .
A pair set Pair( O i ) is composed of all the object pairs x a
X 1
X 2 ,x a
x b , such that x a pre-
cedes x b in the order O i . For example, from the order O 1 = x 3
x 2
x 1 , three object
pairs, x 3
x 1 , are extracted. For a set of orders S ,thePair( S ) is
composed of all pairs in Pair( O i ) of O i
x 2 , x 3
x 1 ,and x 2
S . Note that if the same object pairs are
contained in numbers of Pair( O i ), these pairs are multiply added into the Pair( S ).For
example, if the same object pairs x 1
x 2 are extracted from O 5 and O 7 in S , both two
ordered pairs x 1
x 2 are multiply included in Pair( S ).
The task of clustering orders is as follows. A set of sample orders, S =
{
O 1 ,O 2 ,...,
O N }
=
j . In addition, O i and O j can be discordant. The aim of clustering is to divide the
S into a partition. The partition, π =
,N
≡|
S
|
, is given. Note that sample orders may be incomplete, i.e., X i
= X j ,i
{
C 1 ,C 2 ,...,C K }
,K =
|
π
|
,isasetofall
clusters. Clusters are mutually disjoint and exhaustive, i.e., C k
C l =
,
k, l, k
= l
and S = C 1
C K . Partitions are generated such that orders in the same cluster
are similar (internal cohesion), and those in different clusters are dissimilar (external
isolation).
C 2 ∪···∪
15.2.1 Similarity between Two Orders
Clusters are defined as a collection of similar orders; thus, the similarity measures be-
tween two orders are required. Spearman's ρ [9, 4] is one such measure, signifying the
Search WWH ::




Custom Search