Information Technology Reference
In-Depth Information
O k is
orders C k and a dissimilarity measure between orders d ( O a ,O b ), a central order
defined as the order that minimizes the sum of dissimilarities:
O k =argmi O
d ( O, O i ) .
(15.5)
O i
C k
O k consists of all the objects in C k , i.e., X C k =
Note that the order
O i ∈C k X ( O i ).
The dissimilarity d ( O k ,O i ) is calculated over common objects as in Section 15.3.1.
However, because X i
X ( O k ), the dissimilarity can always be calculated over X i .
Unfortunately, the optimal central order is not tractable except for a special cases. For
example, if using a Kendall distance, the derivation of central orders is NP-hard even if
all sample orders are complete [10].
Therefore, many approximation methods have been developed. However, to use as a
sub-routine in a k -o'means algorithm, the following two constraints must be satisfied.
First, the method must deal with incomplete orders that consist of objects randomly
sampled from X . In [12], they proposed a method to derive a central order of top k
lists, which are special kinds of incomplete orders. Top k list is an order that consists
of the most preferred k objects, and the objects that are not among the top k list are
implicitly ranked lower than these k objects. That is to say, the top k objects of a hidden
complete order are observed. In our case, objects are randomly sampled, and such a
restriction is not allowed. Second, the method should be executed without using iterative
optimization techniques. Since central orders are derived K times in each loop of the
k -o'means algorithm, the derivation method of central orders would seriously affect
efficiency if it adopts the iterative optimization.
To our knowledge, the method satisfying these two constraints is the following one
to derive the minimum square error solution under a generative model of Thurstone's
law of comparative judgment [13]. Because we used this method to derive central or-
ders, we call this clustering algorithm by the k -o'means-TMSE (Thurstone Minimum
Square Error) algorithm. We describe this method for deriving central orders. First, the
probability Pr[ x a
x b ] is estimated. The pair set of Pair( C k ) in Section 15.2 is gen-
erated from C k instep3of k -o'means-TMSE. Next, we calculate the probabilities for
every pair of objects in C k :
|
x a
x b |
+0 . 5
Pr[ x a
x b ]=
+1 ,
|
x a
x b |
+
|
x b
x a |
where
x b ,inthePair( C k ).These
probabilities are applied to a model of Thurstone's law of comparative judgment. This
model assumes that scores are assigned to each object x l , and an order is derived by
sorting according to these scores. Scores follow a normal distribution; i.e., N ( μ l ),
where μ l is the mean score of the object x l ,and σ is a common constant standard
deviation. Based on this model, the probability that object x a precedes the x b is
|
x a
x b |
is the number of the object pairs, x a
x b ]=
−∞
)
t
φ ( t
μ a
σ
φ ( u
μ b
Pr[ x a
) du dt
σ
−∞
= Φ μ a
,
μ b
2 σ
(15.6)
 
Search WWH ::




Custom Search