the middle,” that is, near the decision boundary. More interesting is the mean response time on test-2 for L-subjects (solid line) and R-subjects (dashed line). L-subjects have a response time plateau around x = −0.1, which is left-shifted compared to test-1, whereas R-subjects have a response time peak around x = 0.6, which is right-shifted. The response times thus also suggest that exposure to the unlabeled data has shifted the decision boundary.
The ZRQK study explains the human behavior with a Gaussian Mixture Model (see Chapter 3). They fit the model with the EM algorithm. After seeing the labeled data plus test-1 (since these are what the human subjects see when they make decisions on test-1), the Gaussian Mixture Model predicts p(y = 1 | x) as the dotted line in Figure 7.4(a). Then, after exposure to unlabeled data, the Gaussian Mixture Models fit on all data (labeled, test-1, unlabeled, and test-2) predict a decision boundary shift for the left-shift condition (solid line) and right-shift condition (dashed line) in Figure 7.4(a), which qualitatively explains the observed human behavior.
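The boundary shift described above can be illustrated with a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture in which labeled points keep their class fixed and only unlabeled points receive soft assignments. This is not the ZRQK implementation: the labeled locations (x = −1 and x = 1), the shift of ±0.5, the component standard deviation of 0.4, the sample sizes, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior(x, w, mu, var):
    """p(y = 1 | x) under a two-component 1-D mixture."""
    p0 = w[0] * normal_pdf(x, mu[0], var[0])
    p1 = w[1] * normal_pdf(x, mu[1], var[1])
    return p1 / (p0 + p1)

def fit_gmm(labeled, unlabeled, iters=100):
    """EM with labeled points (x, y), y in {0, 1}, clamped to their class."""
    xs = [x for x, _ in labeled] + list(unlabeled)
    # r[i] is the responsibility of component 1 for point i.
    r = [float(y) for _, y in labeled] + [0.5] * len(unlabeled)
    w, var = [0.5, 0.5], [1.0, 1.0]
    # Initialize the means from the labeled points of each class.
    mu = [
        sum(x for x, y in labeled if y == k) / sum(1 for _, y in labeled if y == k)
        for k in (0, 1)
    ]
    for _ in range(iters):
        # E-step: update responsibilities of unlabeled points only.
        for i in range(len(labeled), len(xs)):
            r[i] = posterior(xs[i], w, mu, var)
        # M-step: weighted maximum-likelihood updates over all points.
        for k in (0, 1):
            rk = [ri if k == 1 else 1.0 - ri for ri in r]
            nk = sum(rk)
            w[k] = nk / len(xs)
            mu[k] = sum(a * x for a, x in zip(rk, xs)) / nk
            # Small floor keeps the variance from collapsing to zero.
            var[k] = max(sum(a * (x - mu[k]) ** 2 for a, x in zip(rk, xs)) / nk, 1e-2)
    return w, mu, var

def decision_boundary(w, mu, var):
    """Bisection for the x between the two means where p(y = 1 | x) = 0.5."""
    lo, hi = mu[0], mu[1]
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if posterior(mid, w, mu, var) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shifted_unlabeled(shift, n=400, seed=0):
    """Hypothetical unlabeled data: a symmetric two-cluster mixture moved by `shift`."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice((-1.0, 1.0)) + shift, 0.4) for _ in range(n)]

labeled = [(-1.0, 0)] * 10 + [(1.0, 1)] * 10
left = decision_boundary(*fit_gmm(labeled, shifted_unlabeled(-0.5)))
right = decision_boundary(*fit_gmm(labeled, shifted_unlabeled(+0.5)))
print(f"left-shift boundary: {left:.2f}, right-shift boundary: {right:.2f}")
```

Because the unlabeled points far outnumber the labeled ones in this sketch, the fitted means follow the unlabeled clusters, and the point where p(y = 1 | x) crosses 0.5 moves in the direction of the shift.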
These Gaussian Mixture Models can also explain the observed response time. Let the response time model be

$$ a H(x) + b_i, \tag{7.4} $$

for test-i, i = 1, 2. H(x) is the entropy of the prediction

$$ H(x) = -\sum_{y=-1,1} p(y \mid x) \log p(y \mid x), \tag{7.5} $$

which is zero for confident predictions p(y | x) = 0 or 1, and one for uncertain predictions p(y | x) = 0.5. The parameters a = 168, b_1 = 688, b_2 = 540, obtained with least squares from the empirical data, produce the fit in Figure 7.4(b), which explains the empirical peaks before and after seeing unlabeled data in Figure 7.3(b).
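As a small worked example of the response time model in (7.4) and (7.5), the sketch below evaluates a H(x) + b_i using the reported parameters. The base-2 logarithm is an assumption, made because the text states that the entropy equals one at p(y | x) = 0.5; the function names are hypothetical.

```python
import math

# Least-squares parameters reported in the text.
A, B1, B2 = 168.0, 688.0, 540.0

def entropy_bits(p):
    """Binary entropy H(x), given p = p(y = 1 | x); log base 2 is assumed."""
    h = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # treat 0 * log(0) as 0
            h -= q * math.log2(q)
    return h

def response_time(p, test):
    """Predicted response time a * H(x) + b_i for test-i, i in {1, 2}."""
    b = {1: B1, 2: B2}[test]
    return A * entropy_bits(p) + b

print(response_time(0.5, 1))  # 856.0: maximal uncertainty on test-1
print(response_time(1.0, 2))  # 540.0: fully confident prediction on test-2
```

Under this model the predicted response time peaks wherever the mixture model is least certain, so a shifted decision boundary yields a shifted response time peak.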
7.4 STUDY THREE: ABSENCE OF HUMAN SEMI-SUPERVISED LEARNING IN A COMPLEX TASK
The previous two sections discuss positive studies where human semi-supervised learning behavior
is observed. In this section, we present a study by Vandist, De Schryver and Rosseel (VDR), which
is a negative result [175]. The task is again binary classification. However, the feature space is two-dimensional. Each class is distributed as a long, thin Gaussian tilted at a 45° angle. The true decision
boundary is therefore along the diagonal, as shown in Figure 7.5(a). Such non-axis-parallel decision
boundaries are called “information-integration tasks” in psychology since the learner has to integrate
information from two features [7]. They are considered to be more complex and difficult to learn,
because there is no verbal analogue to a univariate rule.
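To make the geometry concrete, the following sketch samples two long, thin Gaussian classes tilted at 45° and compares a diagonal rule with the best-intentioned single-feature rule. The class centers, standard deviations, sample sizes, and names are hypothetical and are not taken from the VDR study; they are chosen only to show why a univariate rule performs poorly on such a task.

```python
import math
import random

def sample_class(rng, center, long_sd=1.0, short_sd=0.15):
    """Draw one point from a long, thin Gaussian whose long axis is tilted 45 degrees."""
    u = rng.gauss(0.0, long_sd)   # displacement along the diagonal
    v = rng.gauss(0.0, short_sd)  # displacement across the diagonal
    c = math.sqrt(0.5)            # cos(45 deg) = sin(45 deg)
    return (center[0] + c * (u - v), center[1] + c * (u + v))

def accuracy(points, rule):
    """Fraction of (point, label) pairs that the rule classifies correctly."""
    return sum(rule(x1, x2) == y for (x1, x2), y in points) / len(points)

rng = random.Random(0)
# Class A sits just above the diagonal x2 = x1, class B just below it.
data = [(sample_class(rng, (-0.3, 0.3)), "A") for _ in range(1000)]
data += [(sample_class(rng, (0.3, -0.3)), "B") for _ in range(1000)]

def diagonal_rule(x1, x2):
    """Integrates both features: compares x2 against x1."""
    return "A" if x2 > x1 else "B"

def univariate_rule(x1, x2):
    """Uses the first feature only, ignoring the second."""
    return "A" if x1 < 0 else "B"

diag_acc = accuracy(data, diagonal_rule)
uni_acc = accuracy(data, univariate_rule)
print(f"diagonal rule: {diag_acc:.2f}, univariate rule: {uni_acc:.2f}")
```

In this setup the classes overlap heavily when projected onto either feature alone, so only a rule that combines both features separates them well.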
In the VDR study, the stimuli are Gabor patches similar to those in Figure 7.5(b), where the
two features are frequency and orientation of the “gratings.” We discuss one of their experiments
that is particularly relevant to human semi-supervised learning. In the experiment, there are two
conditions: