70 CHAPTER 7. HUMAN SEMI-SUPERVISED LEARNING
• Prediction function f: the concepts formed in the human mind. The function itself is not directly observable. However, it is possible to observe any particular prediction f(x), i.e., how the human classifies stimulus x.
In many real world situations, humans are exposed to a combination of labeled data and far
more unlabeled data when they need to make a classification decision: an airport security officer must
decide whether a piece of luggage poses a threat, where the labeled data comes from job training,
and the unlabeled data comes from all the luggage passing through the checkpoint; a dieter must
decide which foods are healthy and which are not, based on nutrition labels and advertisements; a
child must decide which names apply to which objects from Dad's instructions and observations of
the world around her. Some questions naturally arise:
• When learning concepts, do people make systematic use of unlabeled data in addition to labeled data?
• If so, can such usage be understood with reference to semi-supervised learning models in machine learning?
• Can the study of how humans use labeled and unlabeled data improve machine learning in domains where human performance outstrips machine performance?
Understanding how humans combine information from labeled and unlabeled data to draw in-
ferences about conceptual boundaries can have significant social impact, ranging from improving
education to addressing homeland security issues. Standard psychological theories of concept learning have focused mainly on supervised learning. However, in the realistic setting where both labeled and unlabeled data are available, semi-supervised learning offers explicit computational hypotheses
that can be empirically tested in the laboratory. In what follows, we cite three studies that demonstrate
the complexity of human semi-supervised learning behaviors.
7.2 STUDY ONE: HUMANS LEARN FROM UNLABELED TEST DATA
Zaki and Nosofsky conducted a behavioral experiment that demonstrates the influence of unlabeled test data on learning [194]. In short, unlabeled test data continues to change the concept originally
learned in the training phase, in a manner consistent with self-training. For machine learning re-
searchers, it may come as a mild surprise that humans may not be able to hold their learned function
fixed during testing.
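The self-training dynamic described above can be sketched in code. The following is an illustrative toy model, not the experimental procedure: the nearest-mean classifier, the data, and the update rule are all assumptions chosen to make the drift visible.

```python
# Toy self-training sketch (hypothetical model, not Zaki & Nosofsky's task).
# A 1-D nearest-mean classifier is fit on labeled training data, then keeps
# updating its class means with each point it self-labels at test time,
# so the learned concept drifts instead of staying fixed.

def self_train(labeled, unlabeled):
    # labeled: list of (x, y) pairs with y in {0, 1}; unlabeled: list of x
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, y in labeled:
        sums[y] += x
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in (0, 1)}

    predictions = []
    for x in unlabeled:
        # predict the nearest class mean
        y_hat = min((0, 1), key=lambda y: abs(x - means[y]))
        predictions.append(y_hat)
        # fold the self-labeled point back in, shifting the decision boundary
        sums[y_hat] += x
        counts[y_hat] += 1
        means[y_hat] = sums[y_hat] / counts[y_hat]
    return predictions, means

# Unlabeled points near class 1 pull its mean toward them,
# moving the boundary away from where supervised training left it.
preds, means = self_train([(0.0, 0), (1.0, 1)], [0.6, 0.7, 0.8])
```

Because each confidently self-labeled point is absorbed into the model, the class-1 mean ends below its initial value of 1.0, which is exactly the kind of test-time concept change the study observed in humans.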
We present the Zaki and Nosofsky study in machine learning terms. The task is one-class classification or outlier detection: the human subject is first presented with a training sample {(x_i, y_i = 1)}_{i=1}^{l}. Note importantly that she is told that all the training instances come from one class. Then, during testing, the subject is shown u unlabeled instances {x_i}_{i=l+1}^{l+u}, and her task is to classify each instance as y_i = 1 or not. This is usually posed as a density level-set problem in machine learning.
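A density level-set formulation of this one-class task can be sketched as follows. This is a hypothetical illustration: the Gaussian kernel density estimator, the bandwidth h, and the threshold value are arbitrary choices for exposition, not taken from the study.

```python
import math

# Sketch of one-class classification as a density level set:
# estimate the density of the single training class, then label a
# test point y = 1 iff its estimated density exceeds a level lambda.
# Bandwidth h and the threshold are illustrative assumptions.

def kde(train, x, h=0.5):
    # Gaussian kernel density estimate at x from one-class training data
    z = sum(math.exp(-(((x - t) / h) ** 2) / 2) for t in train)
    return z / (len(train) * h * math.sqrt(2 * math.pi))

def classify(train, x, threshold=0.1):
    # y = 1 (inside the level set) iff density >= threshold
    return 1 if kde(train, x) >= threshold else 0

train = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]  # all training instances have y = 1
```

A point inside the cluster, such as x = 0.5, falls in the high-density region and is labeled 1; a far-away point such as x = 5.0 falls below the level and is rejected as an outlier.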