between the phrase and meronymy discriminators associated with the product class (see Table 2.2). opine distinguishes parts from properties using WordNet's IS-A hierarchy (which enumerates different kinds of properties) and morphological cues (e.g., "-iness", "-ity" suffixes).
Given a target product class C, opine finds concepts related to C by extracting frequent noun phrases as well as noun phrases linked to C or C's instances through verbs or prepositions (e.g., "The scanner produces great images"). Related concepts are assessed as described in [6] and then stored as product features together with their parts and properties.
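The frequent-noun-phrase step above can be sketched in miniature. This assumes reviews have already been chunked into noun phrases (the chunking itself is out of scope here), and the minimum-support threshold is an illustrative parameter, not opine's actual setting:

```python
from collections import Counter

def frequent_noun_phrases(chunked_reviews, min_support=3):
    """Toy sketch: given reviews already chunked into noun phrases,
    keep phrases whose frequency meets a minimum-support threshold,
    as candidate product features."""
    counts = Counter(
        np.lower() for review in chunked_reviews for np in review
    )
    return {np for np, c in counts.items() if c >= min_support}

# Hypothetical pre-chunked scanner reviews:
reviews = [
    ["scanner", "image quality", "scan speed"],
    ["image quality", "software"],
    ["image quality", "scan speed", "price"],
]
features = frequent_noun_phrases(reviews, min_support=2)
# {"image quality", "scan speed"}
```

In the actual system, candidates surviving this frequency filter would then be assessed (as described in [6]) before being accepted as features.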
2.3.3 Experiments: Explicit Feature Extraction
The previous review-mining systems most relevant to our work are those in [2] and [7]. We only had access to the data used in [2], so our experiments include a comparison between opine and Hu and Liu's system, but no direct comparison between opine and IBM's SentimentAnalyzer [7] (see the related work section for a discussion of this work).
Hu and Liu's system uses association rule mining to extract frequent review noun phrases as features. Frequent features are used to find potential opinion words (only adjectives), and the system uses WordNet synonyms and antonyms in conjunction with a set of seed words in order to find actual opinion words. Finally, opinion words are used to extract associated infrequent features. The system only extracts explicit features.
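The seed-word expansion that Hu and Liu's system performs with WordNet can be sketched as follows; the tiny synonym/antonym tables stand in for WordNet lookups, and the integer polarity encoding is an illustrative assumption:

```python
def expand_seed_lexicon(seed_polarity, synonyms, antonyms):
    """Toy sketch of seed expansion: a synonym inherits a word's
    polarity (+1 positive, -1 negative), an antonym gets the
    opposite; expansion repeats until no new words are reached."""
    polarity = dict(seed_polarity)
    frontier = list(seed_polarity)
    while frontier:
        word = frontier.pop()
        for syn in synonyms.get(word, []):
            if syn not in polarity:
                polarity[syn] = polarity[word]
                frontier.append(syn)
        for ant in antonyms.get(word, []):
            if ant not in polarity:
                polarity[ant] = -polarity[word]
                frontier.append(ant)
    return polarity

# Hypothetical lookup tables standing in for WordNet:
lexicon = expand_seed_lexicon(
    seed_polarity={"good": 1},
    synonyms={"good": ["great"]},
    antonyms={"good": ["bad"], "great": ["awful"]},
)
# {"good": 1, "great": 1, "bad": -1, "awful": -1}
```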
On the five datasets used in [2], opine's precision is 22% higher than Hu's, at the cost of a 3% drop in recall. There are two important differences between
opine and Hu's system: a) opine's Feature Assessor uses PMI assessment to
evaluate each candidate feature and b) opine incorporates Web PMI statistics
in addition to review data in its assessment. In the following, we quantify the
performance gains from a) and b).
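A PMI-style assessment of a candidate feature against a discriminator phrase can be sketched as below. The hit counts are made up for illustration, and the exact normalization opine uses may differ in detail:

```python
def pmi_score(hits_both, hits_feature, hits_discriminator):
    """PMI-style association score between a candidate feature f and
    a discriminator phrase d:
        score(f, d) ~ Hits(f AND d) / (Hits(f) * Hits(d))
    where Hits(.) counts occurrences in review data or Web search
    results. Higher scores suggest f is a genuine feature."""
    if hits_feature == 0 or hits_discriminator == 0:
        return 0.0
    return hits_both / (hits_feature * hits_discriminator)

# Hypothetical counts for the candidate "size" against the
# discriminator "of scanner":
score = pmi_score(hits_both=120, hits_feature=10_000,
                  hits_discriminator=5_000)
```

Assessing candidates against Web-scale counts rather than review-only counts is what gives the additional precision measured in b) below.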
a) In order to quantify the benefits of opine's Feature Assessor, we use
it to evaluate the features extracted by Hu's algorithm on review data. The
Feature Assessor improves Hu's precision by 6%.
b) In order to evaluate the impact of using Web PMI statistics, we assess
opine's features first on reviews, and then on reviews in conjunction with the
Web. Web PMI statistics increase precision by an average of 14.5%.
Overall, 1/3 of opine's precision increase over Hu's system comes from using PMI assessment on reviews and the other 2/3 from the use of the Web PMI statistics.
In order to show that opine's performance is robust across multiple product classes, we used two sets of 1,307 reviews downloaded from tripadvisor.com for Hotels and amazon.com for Scanners. Two annotators labeled a set of 450 unique opine extractions as correct or incorrect. The inter-annotator agreement was 86%. The extractions on which the annotators agreed were used to compute opine's precision, which was 89%. Furthermore, the annotators extracted explicit features from 800 review sentences (400 for