[Figure 3.4 appears here: two panels, each a mixture of two uniform distributions on [0, 1]. Left: p(x | y = -1) = 5 with weight 0.2 and p(x | y = 1) = 1.25 with weight 0.8. Right: p(x | y = -1) = 1.67 with weight 0.6 and p(x | y = 1) = 2.5 with weight 0.4. Both mixtures yield the same marginal p(x) = 1 on [0, 1].]
Figure 3.4: An example of unidentifiable models. Even if we know p(x) is a mixture of two uniform distributions, we cannot uniquely identify the two components. For instance, the two mixtures produce the same p(x), but they classify x = 0.5 differently. Note the height of each distribution represents a probability density (which can be greater than 1), not probability mass. The area under each distribution is 1.
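To make the caption's example concrete, the following sketch (my reconstruction of the figure's numbers, not code from the text) checks that both mixtures induce the same marginal density on [0, 1] while disagreeing on the label of x = 0.5:

```python
def mixture_a(x):
    """0.2 * Uniform(0, 0.2) + 0.8 * Uniform(0.2, 1); component densities 5 and 1.25."""
    neg = 5.0 if 0.0 <= x <= 0.2 else 0.0    # p(x | y = -1)
    pos = 1.25 if 0.2 < x <= 1.0 else 0.0    # p(x | y = +1)
    return 0.2 * neg + 0.8 * pos

def mixture_b(x):
    """0.6 * Uniform(0, 0.6) + 0.4 * Uniform(0.6, 1); component densities 1/0.6 and 2.5."""
    neg = (1.0 / 0.6) if 0.0 <= x <= 0.6 else 0.0    # p(x | y = -1)
    pos = 2.5 if 0.6 < x <= 1.0 else 0.0             # p(x | y = +1)
    return 0.6 * neg + 0.4 * pos

# Both mixtures give the same marginal p(x) = 1 everywhere on [0, 1] ...
for x in [0.05, 0.3, 0.5, 0.9]:
    assert abs(mixture_a(x) - 1.0) < 1e-9
    assert abs(mixture_b(x) - 1.0) < 1e-9

# ... yet at x = 0.5, mixture A puts all posterior mass on y = +1
# (0.5 lies outside its y = -1 component), while mixture B puts all
# posterior mass on y = -1. The components are unidentifiable from p(x) alone.
```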
it. Selecting a better θ^(0) that is more likely to lead to the global optimum (or simply a better local optimum) is another heuristic method, though this may require domain expertise.
Finally, we note that the goal of optimization for semi-supervised learning with mixture models
is to maximize the log likelihood (3.13). The EM algorithm is only one of several optimization
methods to find a (local) optimum. Direct optimization methods are possible, too, for example
quasi-Newton methods like L-BFGS [115].
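As a sketch of such direct optimization (not code from the text), the semi-supervised log likelihood can be handed to an off-the-shelf quasi-Newton optimizer. The data, the two-component Gaussian parameterization, and the logit trick for the mixing weight below are all my illustrative assumptions; the labeled terms use log p(x, y) and the unlabeled terms use log p(x), mirroring the objective in (3.13):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Toy semi-supervised data (assumed, for illustration): 1-D, two classes.
rng = np.random.default_rng(0)
x_lab = np.array([-2.0, -1.5, 1.5, 2.0])
y_lab = np.array([0, 0, 1, 1])
x_unl = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])

def neg_log_lik(theta):
    """Negative semi-supervised log likelihood: labeled terms contribute
    log p(x, y | theta), unlabeled terms contribute log p(x | theta).
    theta = (logit of class-1 weight, mean of class 0, mean of class 1);
    both components have fixed unit variance for simplicity."""
    a, mu0, mu1 = theta
    w = 1.0 / (1.0 + np.exp(-a))              # mixing weight of class 1
    pi = np.array([1.0 - w, w])
    mus = np.array([mu0, mu1])
    lab = np.sum(np.log(pi[y_lab]) + norm.logpdf(x_lab, mus[y_lab], 1.0))
    dens = pi[0] * norm.pdf(x_unl, mu0, 1.0) + pi[1] * norm.pdf(x_unl, mu1, 1.0)
    unl = np.sum(np.log(dens))
    return -(lab + unl)

# L-BFGS finds a (local) optimum of the same objective EM targets.
res = minimize(neg_log_lik, x0=np.array([0.0, -1.0, 1.0]), method="L-BFGS-B")
```

With data generated around -2 and +2, the fitted means in `res.x[1:]` land near those centers; as with EM, only a local optimum is guaranteed, so the starting point `x0` still matters.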
3.6 CLUSTER-THEN-LABEL METHODS
We have used the EM algorithm to identify the mixing components from unlabeled data. Recall
that unsupervised clustering algorithms can also identify clusters from unlabeled data. This suggests
a natural
cluster-then-label
algorithm for semi-supervised classification.
Algorithm 3.9. Cluster-then-Label.
Input: labeled data (x_1, y_1), ..., (x_l, y_l), unlabeled data x_{l+1}, ..., x_{l+u},
a clustering algorithm A, and a supervised learning algorithm L.
1. Cluster x_1, ..., x_{l+u} using A.
2. For each resulting cluster, let S be the labeled instances in this cluster:
3. If S is non-empty, learn a supervised predictor from S: f_S = L(S).
Apply f_S to all unlabeled instances in this cluster.
4. If S is empty, use the predictor f trained from all labeled data.
Output: labels on unlabeled data y_{l+1}, ..., y_{l+u}.
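A minimal sketch of Algorithm 3.9, assuming scikit-learn is available; k-means stands in for the clustering algorithm A and logistic regression for the supervised learner L (both are my choices for illustration, not prescribed by the text):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def cluster_then_label(x_lab, y_lab, x_unl, n_clusters=2):
    """Cluster all l+u instances, then label each cluster's unlabeled
    points with a predictor trained on that cluster's labeled points."""
    x_all = np.vstack([x_lab, x_unl])
    # Step 1: cluster x_1, ..., x_{l+u} using A (here: k-means).
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=0).fit_predict(x_all)
    c_lab, c_unl = clusters[:len(x_lab)], clusters[len(x_lab):]
    # Step 4 fallback: predictor f trained from all labeled data.
    f_all = LogisticRegression().fit(x_lab, y_lab)
    y_unl = np.empty(len(x_unl), dtype=y_lab.dtype)
    for c in range(n_clusters):
        mask = c_unl == c
        if not mask.any():
            continue
        S = c_lab == c  # labeled instances falling in cluster c
        if S.any() and len(np.unique(y_lab[S])) > 1:
            # Step 3: f_S = L(S), applied to this cluster's unlabeled points.
            f = LogisticRegression().fit(x_lab[S], y_lab[S])
        elif S.any():
            # Single-class S: the cluster predictor degenerates to that class.
            y_unl[mask] = y_lab[S][0]
            continue
        else:
            f = f_all  # step 4: S is empty
        y_unl[mask] = f.predict(x_unl[mask])
    return y_unl
```

For example, with two labeled points at (-2, 0) and (2, 0) and unlabeled points near each, the two k-means clusters each inherit the label of their single labeled member.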