instantiates relation-specific generic extraction patterns into extraction rules
which find candidate facts. KnowItAll's Assessor then assigns a probability
to each candidate. The Assessor uses a form of Point-wise Mutual Information
(PMI) between phrases that is estimated from Web search engine hit
counts [5]. It computes the PMI between each fact and automatically generated
discriminator phrases (e.g., “is a scanner” for the isA() relationship
in the context of the Scanner class). Given fact f and discriminator d, the
computed PMI score is:
$$\mathrm{PMI}(f, d) = \frac{\mathrm{Hits}(d + f)}{\mathrm{Hits}(d)\,\mathrm{Hits}(f)}$$
For example, a high PMI between “Epson 1200” and phrases such as “is a
scanner” suggests that “Epson 1200” is a Scanner instance. The PMI scores
are converted to binary features for a Naive Bayes Classifier, which outputs a
probability associated with each fact [4].
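As a minimal sketch of the PMI score defined above, the following Python computes the ratio from three hit counts and turns a list of scores into binary features. The hit counts are invented for illustration, and the per-discriminator thresholds are an assumption: the text says only that PMI scores are converted to binary features, not how. The function names are hypothetical.

```python
def pmi(hits_d_and_f: int, hits_d: int, hits_f: int) -> float:
    """PMI(f, d) = Hits(d + f) / (Hits(d) * Hits(f)).

    Returns 0.0 when either individual count is zero (an assumed
    convention, chosen here to avoid division by zero).
    """
    if hits_d == 0 or hits_f == 0:
        return 0.0
    return hits_d_and_f / (hits_d * hits_f)


def binary_features(scores, thresholds):
    """Map each PMI score to 1 if it exceeds its threshold, else 0.

    Thresholding is an assumption of this sketch.
    """
    return [1 if s > t else 0 for s, t in zip(scores, thresholds)]


# Hypothetical hit counts for a fact f and a discriminator d.
score = pmi(hits_d_and_f=50, hits_d=1000, hits_f=200)

# Two discriminators: the second one never co-occurs with the fact.
features = binary_features([score, 0.0], thresholds=[1e-4, 1e-4])
```

The resulting 0/1 vector is the kind of input a Naive Bayes classifier could consume to produce a probability for each fact.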
2.3.2 Finding Explicit Features
opine extracts explicit features for the given product class from parsed review
data. The system recursively identifies the parts and the properties of the given
product class and their parts and properties, in turn, continuing until no more
such features are found. The system then finds related concepts and extracts
their meronyms (parts) and properties. Table 2.1 shows that each feature type
contributes to the set of final features (averaged over seven product classes).
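The recursive expansion described above can be sketched as a breadth-first traversal that stops once no new feature turns up. This is an illustrative sketch only: `find_parts_and_properties` is a hypothetical callback standing in for whatever extraction step supplies the parts and properties of a concept, and the toy lookup table is invented.

```python
from collections import deque


def collect_features(product_class, find_parts_and_properties):
    """Expand parts and properties until no new feature is found."""
    seen = {product_class}
    queue = deque([product_class])
    features = []
    while queue:
        concept = queue.popleft()
        for feature in find_parts_and_properties(concept):
            if feature not in seen:
                seen.add(feature)
                features.append(feature)
                # A newly found feature may have parts and properties too.
                queue.append(feature)
    return features


# Invented toy data; a real system would derive this from review text.
toy = {
    "Scanner": ["ScannerSize", "ScannerCover"],
    "ScannerCover": ["ScannerCoverSize"],
}
result = collect_features("Scanner", lambda concept: toy.get(concept, []))
```

The `seen` set guarantees termination even if the lookup returns a concept that was already visited.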
Table 2.1. Explicit Feature Information

Explicit Features            Examples           % Total
Properties                   ScannerSize         7%
Parts                        ScannerCover       52%
Features of Parts            BatteryLife        24%
Related Concepts             ScannerImage        9%
Related Concepts' Features   ScannerImageSize    8%
Table 2.2. Meronymy Lexical Patterns
Notation: [C] = product class (or instance), [M] = candidate meronym, (*) = wildcard character

[M] of (*) [C]                [M] for (*) [C]
[C]'s [M]                     [C] has (*) [M]
[C] with (*) [M]              [M] (*) in (*) [C]
[C] come(s) with (*) [M]      [C] contain(s)(ing) (*) [M]
[C] equipped with (*) [M]     [C] endowed with (*) [M]
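A few of the patterns in Table 2.2 can be approximated with regular expressions, as in the sketch below. It covers only four of the ten patterns. Reading the wildcard (*) as "zero to three intervening words" is an assumption of this sketch, as are the function name and the sample sentences.

```python
import re

# (*) wildcard: assumed here to mean zero to three intervening words.
GAP = r"(?:\w+\s+){0,3}"

# A subset of the Table 2.2 patterns, as templates over {c} and {m}.
TEMPLATES = [
    r"\b{m}\s+of\s+{gap}{c}\b",             # [M] of (*) [C]
    r"\b{c}'s\s+{m}\b",                     # [C]'s [M]
    r"\b{c}\s+has\s+{gap}{m}\b",            # [C] has (*) [M]
    r"\b{c}\s+comes?\s+with\s+{gap}{m}\b",  # [C] come(s) with (*) [M]
]


def matching_patterns(sentence, product_class, candidate):
    """Return the indices of the templates that match the sentence."""
    c, m = re.escape(product_class), re.escape(candidate)
    hits = []
    for i, template in enumerate(TEMPLATES):
        regex = template.format(c=c, m=m, gap=GAP)
        if re.search(regex, sentence, flags=re.IGNORECASE):
            hits.append(i)
    return hits


# Invented example sentences.
of_hits = matching_patterns("The cover of the scanner is sturdy.", "scanner", "cover")
with_hits = matching_patterns("This scanner comes with a removable tray.", "scanner", "tray")
```

Instantiating each template with a known class and candidate, rather than capturing an arbitrary word, keeps the wildcard from swallowing the candidate itself.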
In order to find parts and properties, opine first extracts the noun phrases
from reviews and retains those with frequency greater than an experimentally
set threshold. opine's Feature Assessor, which is an instantiation of KnowItAll's
Assessor, evaluates each noun phrase by computing the PMI scores