If semi-supervised learning helps in this information-integration task, one would expect subjects under condition 1 to learn faster and more accurately than those under condition 2. This was not observed in the VDR study: the average accuracy in each block of 100 trials is very similar under the two conditions. Both increase gradually, from around 73% in the first block to around 93% in the eighth (last) block.
Therefore, unlabeled data did not affect learning in this experiment. Several factors might contribute to this negative result. First, the information-integration task in the VDR study is considerably more difficult than the threshold task in the ZRQK study. Second, the VDR study also provides much more labeled data. The effects of unlabeled data may be largest when labels are very sparse.
7.5 DISCUSSIONS
These studies, together with a growing body of recent work, reveal interesting similarities and differences between human learning behavior and semi-supervised learning models. On simple tasks, the instances humans classify most confidently depend on the test data distribution, and their decision-boundary judgments align with the low-density region between modes of the unlabeled distribution, just as machine learning models predict. These studies clearly show that supervised learning alone does not account well for human behavior in these tasks, and that semi-supervised learning might be a better explanation.
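The low-density boundary prediction can be illustrated with a small sketch. The data, parameter values, and the plain batch EM loop below are illustrative assumptions, standing in for the mixture models used in the studies:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical unlabeled stimuli drawn from a bimodal (two-category) distribution
x = np.concatenate([rng.normal(-2.0, 0.7, 200), rng.normal(2.0, 0.7, 200)])

# Batch EM for a two-component 1-D Gaussian mixture
mu, sigma, w = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibility of each component for each item
    dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixing weights, means, and standard deviations
    nk = r.sum(axis=0)
    w = nk / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

# The implied decision boundary (equal posterior for the two symmetric components)
# sits in the low-density region between the two modes, near zero here
boundary = mu.mean()
```

The boundary the fitted model implies is driven entirely by where the unlabeled mass thins out, which is the behavior the boundary-placement experiments probe.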
On the harder information-integration task, however, unlabeled data did not help. In addition,
in the ZRQK study, human judgments were less certain (i.e., shallower slopes in Figure 7.3(a))
than predicted by machine learning in Figure 7.4(a). These discrepancies may provide leverage for
understanding how these models might be adapted to better capture human behavior. For example,
we may consider several alternative models: a mixture of heavy-tailed Student-t distributions instead
of Gaussians; the model's “memory” of previous items could be limited in cognitively-plausible
ways; or the model could update its estimates of the mixture coefficients and component parameters
sequentially as each new item is presented (e.g., online machine learning).
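As a sketch of the last idea, the mixture parameters can be nudged after every single item rather than re-fit in batch. The trial stream and the decaying learning-rate schedule below are assumptions for illustration, not details from the original studies:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical trial sequence: items from two categories, presented one at a time
stream = rng.permutation(np.concatenate([rng.normal(-2.0, 0.5, 300),
                                         rng.normal(2.0, 0.5, 300)]))

mu, var, w = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for t, xt in enumerate(stream, start=1):
    lr = 1.0 / (t**0.6 + 10.0)  # decaying step size (an assumed schedule)
    # E-step for this single item: posterior responsibility of each component
    dens = w * np.exp(-(xt - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    r = dens / dens.sum()
    # Stochastic M-step: move each estimate a small step toward this item
    w = (1 - lr) * w + lr * r
    mu = mu + lr * (r / w) * (xt - mu)
    var = var + lr * (r / w) * ((xt - mu) ** 2 - var)
```

Because each item is processed once and then discarded, a learner of this kind needs no explicit memory of past stimuli, which is one way to make the model's "memory" cognitively plausible.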
There are other questions raised by these studies. We may want to design experiments to
distinguish the different semi-supervised learning assumptions discussed in this topic. Furthermore,
what about a child's ability to point to an animal and ask Dad: “What is that?” It seems semi-
supervised learning and active learning (i.e., the setting in which the algorithm gets to choose which
instances are labeled) go hand-in-hand in human learning. By studying active learning in humans,
we may enhance the synergy between semi-supervised and active learning in machines. Ultimately,
these studies illustrate the promise of such cross-disciplinary research: congruency between model
predictions and human behavior in well-understood learning problems can shed insight into how
humans make use of labeled and unlabeled data. In addition, discrepancies between human behavior and machine predictions point the way toward the development of new machine learning models.
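The active-learning setting mentioned above, in which the learner chooses which instances get labeled, can be sketched with uncertainty sampling. The pool, the oracle, the simple class-mean classifier, and the query budget are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical pool of unlabeled 1-D stimuli from two categories
pool = np.concatenate([rng.normal(-1.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
true_label = (pool > 0).astype(int)  # the oracle the learner queries ("What is that?")

# Start with one labeled example per category
labeled_x, labeled_y = list(pool[[0, 100]]), [0, 1]
unqueried = set(range(len(pool))) - {0, 100}

for _ in range(10):  # query budget of 10 items
    x0 = np.array(labeled_x)[np.array(labeled_y) == 0]
    x1 = np.array(labeled_x)[np.array(labeled_y) == 1]
    boundary = (x0.mean() + x1.mean()) / 2.0  # simple class-mean classifier
    # Uncertainty sampling: query the pool item closest to the current boundary
    idx = min(unqueried, key=lambda i: abs(pool[i] - boundary))
    labeled_x.append(pool[idx])
    labeled_y.append(int(true_label[idx]))
    unqueried.remove(idx)
```

The key point is only the query rule: the learner spends its labeling budget on the items it is least certain about, rather than on a random sample.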