36 CHAPTER 4. CO-TRAINING
Because these latter instances are not covered by the two labeled instances in our training
sample, a supervised learner will not be able to classify them correctly. It seems that a very large
labeled training sample is necessary to cover all the variations in location or person expressions. Or
is it?
4.2 CO-TRAINING
It turns out that one does not need a large labeled training sample for this task. It is sufficient to have
a large unlabeled training sample, which is much easier to obtain. Let us say we have the following
unlabeled instances:
instance 3:
... headquartered in (Kazakhstan) ...
instance 4:
... flew to (Kazakhstan) ...
instance 5:
... (Mr. Smith), a partner at Steptoe & Johnson ...
It is illustrative to inspect the features of the labeled and unlabeled instances together:
instance   x(1)               x(2)               y
1.         Washington State   headquartered in   Location
2.         Mr. Washington     vice president     Person
3.         Kazakhstan         headquartered in   ?
4.         Kazakhstan         flew to            ?
5.         Mr. Smith          partner at         ?
One may reason about the data in the following steps:
1. From labeled instance 1, we learn that “headquartered in” is a context that seems to indicate
y = Location.
2. If this is true, we infer that “Kazakhstan” must be a Location since it appears with the same
context “headquartered in” in instance 3.
3. Since instance 4 is also about “Kazakhstan,” it follows that its context “flew to” should indicate
Location.
4. At this point, we are able to classify “China” in “flew to (China)” as a Location, even though
neither “flew to” nor “China” appeared in the labeled data!
5. Similarly, by matching “Mr. *” in instances 2 and 5, we learn that “partner at” is a context for
y = Person. This allows us to classify “(Robert Jordan), a partner at” as Person, too.
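
The five reasoning steps above can be traced in a short Python sketch. This is only an illustration under simplifying assumptions, not the co-training algorithm itself: each view is reduced to a lookup table of exact-match rules rather than a trained classifier with confidence scores, and the names entity_features, classify, and co_train are hypothetical helpers introduced here.

```python
# Each instance is (x1 = entity string, x2 = context, y = label or None if unlabeled).
instances = [
    ("Washington State", "headquartered in", "Location"),
    ("Mr. Washington", "vice president", "Person"),
    ("Kazakhstan", "headquartered in", None),
    ("Kazakhstan", "flew to", None),
    ("Mr. Smith", "partner at", None),
]

def entity_features(entity):
    """View 1 features: the full string, plus the pattern 'Mr. *' when it applies."""
    feats = [entity]
    if entity.startswith("Mr. "):
        feats.append("Mr. *")
    return feats

def classify(entity, context, entity_rules, context_rules):
    """Try the context view first, then the entity view; None if neither has a rule."""
    if context in context_rules:
        return context_rules[context]
    for feat in entity_features(entity):
        if feat in entity_rules:
            return entity_rules[feat]
    return None

def co_train(instances):
    """Alternate between learning rules in both views and labeling unlabeled instances."""
    entity_rules, context_rules = {}, {}
    labels = [y for _, _, y in instances]
    changed = True
    while changed:
        changed = False
        # Learn rules in both views from every instance labeled so far.
        for (x1, x2, _), y in zip(instances, labels):
            if y is None:
                continue
            for feat in entity_features(x1):
                entity_rules.setdefault(feat, y)
            context_rules.setdefault(x2, y)
        # Let the rules of either view label the remaining instances.
        for i, (x1, x2, _) in enumerate(instances):
            if labels[i] is None:
                guess = classify(x1, x2, entity_rules, context_rules)
                if guess is not None:
                    labels[i] = guess
                    changed = True
    return entity_rules, context_rules, labels

entity_rules, context_rules, labels = co_train(instances)
print(labels)
print(classify("China", "flew to", entity_rules, context_rules))
print(classify("Robert Jordan", "partner at", entity_rules, context_rules))
```

In this sketch, the rule for “flew to” is learned only because the entity view first labels instance 4 through “Kazakhstan,” which mirrors steps 2 and 3: a label found in one view becomes training data for the other.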
This process bears a strong resemblance to the self-training algorithm in Section 2.5, where a
classifier uses its most confident predictions on unlabeled instances to teach itself. There is a critical
difference, though: we implicitly used two classifiers in turn. They operate on different views of an
instance: one is based on the named entity string itself (x(1)), and the other is based on the context
 