Database Reference
In-Depth Information
The intuitive interpretation of the centroid update rule is that if a must-link
constraint is violated, the cluster centroid is moved towards the other cluster
containing the other instance. Similarly, the interpretation of the update
rule for a cannot-link constraint violation is that cluster centroid containing
both constrained instances should be moved to the nearest cluster centroid so
that one of the instances eventually gets assigned to it, thereby satisfying the
constraint.
7.4.3 LCVQE: An Extension to CVQE
Pelleg and Baras (39) create a variation of the assignment and update rules
for CVQE that they term LCVQE. Though their algorithm was not derived to
minimize a particular objective function, it was shown to improve performance
on several standard datasets both in terms of accuracy and run-time. The two
main extensions made by this algorithm over CVQE are: a) not computing
all possible k 2 assignments but only a subset of reasonable assignments and
b) Changing the penalty for a cannot-link constraint to be the distance from
the most outlying (with respect to the cluster centroid) instance in the CL
constraint to the cluster centroid nearest it.
The assignment step shown in Equation 7.9 and the centroid update rule
are shown in Equation 7.10.
argmin j D ( x i j ) 2
x i
/
C =
C = :
(7.9)
( x a ,x b )
C = : argmin [ i = g ( x a ) ,j = g ( x b )] , [ i = g ( x a ) ,j = i, [ i = j,j = g ( x b )]
D ( x a i ) 2 + D ( x b j ) 2 +
D ( μ j i ) 2
¬
δ ( a, b )
( x a ,x b )
C = : argmin [ i = g ( x a ) ,j = g ( x b )] , [ i = g ( x a ) ,j = i.D ( x a ,g ( x a )) <D ( x b ,g ( x b ))]
D ( x a i ) 2 + D ( x b j ) 2 + δ ( a, b )
D ( μ j g ( x b )) 2
μ j =
(7.10)
x i ∈π j [ x i + ( x i ,x a ) ∈C = ,
g ( x i )= g ( x a )
μ g ( x a ) +
μ g ( x a ) ]
,
g ( x i )= g ( x a ) ,D ( x i ) <D ( x a )
(
x i ,x a )
C
=
+ s i ∈μ j , ( s i ,s x ) ∈C = ,g ( s i ) = g ( s x ) 1+ s i ∈μ j , ( s i ,s x ) ∈C = ,g ( s i ) = g ( s x ) 1
|
μ j |
7.4.4 Probabilistic Penalty - PKM
The PKM algorithm allows constraints to be violated during clustering, but
enforces a probabilistic penalty of constraint violation. It is a special case of
the HMRF-KMeans algorithm, which is described in detail in Section 7.6 —
PKM is an ablation of HMRF-KMeans, doing constraint enforcement but not
performing distance learning.
Search WWH ::




Custom Search