Overview of Semi-Supervised Learning - Introduction to Semi-Supervised Learning

BIBLIOGRAPHICAL NOTES

Semi-supervised learning is a maturing field with an extensive literature. It is impossible to cover
all aspects of semi-supervised learning in this introductory topic. We try to select a small sample of
widely used semi-supervised learning approaches to present in the next few chapters, but have to
omit many others for lack of space. We provide a glimpse of these other approaches in Chapter 8.

Semi-supervised learning is one way to address the scarcity of labeled data. We encourage
readers to explore alternative ways to obtain labels. For example, there are ways to motivate human
annotators to produce more labels via computer games [177], the sense of contribution to citizen
science [165], or monetary rewards [3].

Multiple researchers have informally noted that semi-supervised learning does not always help.
Little has been written about this, apart from a few papers such as [48, 64]. This is presumably due
to “publication bias”: negative results tend not to be published. A deeper understanding of when
semi-supervised learning works merits further study.

Yarowsky's word sense disambiguation algorithm [191] is a well-known early example of
self-training. There are theoretical analyses of self-training for specific learning algorithms [50, 80].
In general, however, self-training can be difficult to analyze. Example applications of self-training
can be found in [121, 144, 145].
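
For readers who have not yet seen self-training, the following is a minimal sketch of the general idea: a learner is fit on the labeled data, labels the unlabeled point it is most confident about, adds that point to its training set, and repeats. Everything specific here is an assumption made for illustration and is not taken from Yarowsky's algorithm or from the cited papers: the one-dimensional nearest-centroid base learner, the distance margin used as a confidence score, and the hypothetical names `nearest_centroid_predict` and `self_train`.

```python
from statistics import mean


def nearest_centroid_predict(centroids, x):
    """Return (label, margin) for a 1-D point x.

    The margin is the gap between the distances to the two closest
    centroids; a larger margin is treated as higher confidence.
    (Illustrative confidence measure, an assumption of this sketch.)
    """
    dists = sorted((abs(x - c), label) for label, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(labeled, unlabeled, per_round=1, max_rounds=100):
    """Self-training with a nearest-centroid base learner.

    labeled:   list of (x, y) pairs
    unlabeled: list of x values
    Returns the labeled list extended with self-labeled points.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        # 1. Fit the base learner on the current labeled set.
        by_label = {}
        for x, y in labeled:
            by_label.setdefault(y, []).append(x)
        centroids = {y: mean(xs) for y, xs in by_label.items()}
        # 2. Rank unlabeled points by prediction confidence.
        ranked = sorted(
            pool,
            key=lambda x: nearest_centroid_predict(centroids, x)[1],
            reverse=True,
        )
        # 3. Move the most confident points, with their predicted
        #    labels, into the labeled set.
        for x in ranked[:per_round]:
            label, _ = nearest_centroid_predict(centroids, x)
            labeled.append((x, label))
            pool.remove(x)
    return labeled


if __name__ == "__main__":
    seed = [(0.0, "a"), (10.0, "b")]
    for x, y in self_train(seed, [1.0, 2.0, 8.0, 9.0]):
        print(x, y)
```

Note that the self-assigned labels are never revisited in this sketch, so an early mistake can reinforce itself in later rounds, which is one reason the general procedure can be difficult to analyze.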
