Database Reference
In-Depth Information
initializing clusters and inferring clustering constraints based on neigh-
borhoods derived from labeled examples (5).
Constraint-based clustering techniques have been an active topic of re-
search, where recent techniques include variational techniques (28) or sam-
pling methods (36) for constrained clustering using a graphical model, and fea-
sibility studies for clustering under different types of constraints (16). There
have typically been two types of constraint-based approaches: (1) ones with
strict enforcement, which find the best feasible clustering respecting all the
given constraints (47; 15), and (2) ones with partial enforcement, which find
the best clustering while maximally respecting constraints (6; 43; 16; 28).
Figure 7.2 shows an example of a clustering which respects all the given con-
straints in Figure 7.1. Details of these algorithms are outlined in later sections.
FIGURE 7.1 : Input instances and constraints.
7.2.2 Distance-Based Methods
In distance-based approaches, an existing clustering algorithm that uses
a distance measure is employed. However, rather than use a given distance
metric, the distance measure is first trained to “satisfy” the given constraints.
In this context, satisfying the constraints means that must-linked (similar)
instances are close together and cannot-linked (different) instances are far
apart in the learned distance space. Several distance measures have been
used for distance-based constrained clustering:
string-edit distance trained using EM (8),
Jensen-Shannon divergence trained using gradient descent (12),
Search WWH ::




Custom Search