The outlier violates the well-separated cluster assumption and leads the algorithm astray. Clearly, self-training methods such as propagating 1-nearest-neighbor are highly sensitive to outliers, which can cause incorrect labels to propagate. In the current example, one way to avoid this issue is to consider more than a single nearest neighbor, both when selecting the next point to label and when assigning it a label, as sketched below.
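To make that remedy concrete, the following Python sketch implements nearest-neighbor label propagation. The function name propagate_knn, the NumPy-only implementation, and the convention of marking unlabeled points with -1 are our own illustrative assumptions rather than anything prescribed in the text. With k = 1 it reduces to the outlier-sensitive scheme of Figure 2.4; with k > 1 it both selects the next point and labels it by consulting several labeled neighbors.

    import numpy as np

    def propagate_knn(X, y, k=3):
        """Self-training by nearest-neighbor label propagation (sketch).

        X : (n, d) array of feature vectors.
        y : (n,) array of labels; -1 marks unlabeled points (our convention).
        k : number of labeled neighbors to consult; k=1 reproduces the
            outlier-sensitive propagating 1-nearest-neighbor scheme.
        """
        y = y.copy()
        while np.any(y == -1):
            labeled = np.flatnonzero(y != -1)
            unlabeled = np.flatnonzero(y == -1)
            # Pairwise distances between every unlabeled and labeled point.
            d = np.linalg.norm(X[unlabeled, None, :] - X[None, labeled, :],
                               axis=2)
            kk = min(k, len(labeled))
            # Select the unlabeled point with the smallest average distance
            # to its kk nearest labeled neighbors.
            nearest = np.sort(d, axis=1)[:, :kk]
            i = np.argmin(nearest.mean(axis=1))
            # Label it by majority vote among those kk labeled neighbors.
            nbrs = labeled[np.argsort(d[i])[:kk]]
            vals, counts = np.unique(y[nbrs], return_counts=True)
            y[unlabeled[i]] = vals[np.argmax(counts)]
        return y

Averaging the distance over several labeled neighbors makes a lone outlier less likely to be selected early, and the majority vote keeps a single mislabeled neighbor from dictating the new label; a sufficiently dense group of outliers can, of course, still mislead the procedure.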
[Figure 2.4 appears here: four scatter-plot panels, (a)-(d), each plotting the data against weight (lbs.) on the horizontal axis, with the outlier annotated.]
Figure 2.4: Propagating 1-nearest-neighbor illustration featuring an outlier: (a) after first few iterations,
(b,c) steps highlighting the effect of the outlier, (d) final labeling of all instances, with the entire rightmost
cluster mislabeled.
This concludes our basic introduction to the motivation behind semi-supervised learning and the various issues a practitioner must keep in mind. We also presented a simple example to highlight its potential successes and failures. In the next chapter, we discuss in depth a more sophisticated class of semi-supervised learning algorithms that use generative probabilistic models.