FIGURE 7.7: DistBoost algorithm.
7.5.2 Kernel Distance Functions Using AdaBoost
Hertz et al. (27) proposed a method for distance metric learning that applies
boosting in the product space of the input data space X. They posed the
constrained metric learning problem as that of learning a function that takes
as input pairs of instances from the product space X × X and outputs binary
labels corresponding to must-link (1) and cannot-link (0) constraints. They
used boosting on the product space to learn this function, where boosting is
a standard machine learning tool that combines the strength of an ensemble
of “weak” learners (with low prediction accuracy) to create a “strong”
learner (with high prediction accuracy) (24). The overall flow of the
DistBoost algorithm of Hertz et al. (27) is outlined in Figure 7.7. In the
first step, a constrained weighted EM algorithm is run on the dataset and the
constraints to fit a Gaussian Mixture Model (GMM) over the weighted unlabeled
data and the given constraints. The key difference of constrained EM from
ordinary EM lies in the E-step, which sums the assignment probabilities only
over assignments that comply with the constraints. The resulting GMM is
treated as a “weak” learner and is used to define a “weak” distance function,
where the distance h(x1, x2) between two instances x1 and x2 is computed from
their MAP component assignments in the GMM.
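The core idea above can be sketched in a few lines: fit a GMM as the “weak” learner, then derive a “weak” pairwise distance from the MAP component assignments of the two instances. This is a minimal illustration, not the authors' exact method: it uses an ordinary (unconstrained) GMM in place of the constrained weighted EM step, and assumes a simple same-component rule (distance 0 if both points are MAP-assigned to the same mixture component, 1 otherwise); the function name `weak_distance` is illustrative.

```python
# Sketch of a "weak" distance function derived from a GMM's MAP assignments.
# Assumptions (not from the text): an unconstrained GMM stands in for the
# constrained weighted EM step, and the weak distance is a same-component rule.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy 2-D data: two well-separated clusters.
a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
X = np.vstack([a, b])

# "Weak" learner: a GMM fit to the data.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

def weak_distance(x1, x2, gmm):
    """Weak distance from MAP assignments: 0 (must-link-like) if both
    points fall in the same mixture component, else 1 (cannot-link-like)."""
    k1, k2 = gmm.predict(np.vstack([x1, x2]))
    return 0.0 if k1 == k2 else 1.0

print(weak_distance(a[0], a[1], gmm))  # pair drawn from the same cluster
print(weak_distance(a[0], b[0], gmm))  # pair drawn from different clusters
```

In DistBoost proper, many such weak distance functions are learned on reweighted data across boosting rounds and combined with AdaBoost-style weights into a single strong distance function.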