Extracting Product Features and Opinions from Reviews - Natural Language Processing and Text Mining

The $P(l(w)=L \mid A_k)^{(m)}$ term quantifies the influence of a particular label assignment to $w$'s neighborhood over $w$'s label. In the following, we describe how we estimate this term.

Neighborhood Features. Each type of word relationship which constrains the assignment of SO labels to words (synonymy, antonymy, conjunction, morphological relations, etc.) is mapped by OPINE to a neighborhood feature. This mapping allows OPINE to simultaneously use multiple independent sources of constraints on the label of a particular word. In the following, we formalize this mapping.

Let $T$ denote the type of a word relationship in $R$ and let $A_{k,T}$ represent the labels assigned by $A_k$ to neighbors of a word $w$ which are connected to $w$ through a relationship of type $T$. We have $A_k = \bigcup_T A_{k,T}$ and

$$P(l(w)=L \mid A_k)^{(m)} = P\Big(l(w)=L \,\Big|\, \bigcup_T A_{k,T}\Big)^{(m)}$$

For each relationship type $T$, OPINE defines a neighborhood feature $f_T(w, L, A_{k,T})$ which computes $P(l(w)=L \mid A_{k,T})$, the probability that $w$'s label is $L$ given $A_{k,T}$ (see below). $P(l(w)=L \mid \bigcup_T A_{k,T})^{(m)}$ is estimated by combining the information from various features about $w$'s label using the sigmoid function $\sigma()$:

$$P(l(w)=L \mid A_k)^{(m)} = \sigma\Big(\sum_{i} f_i(w, L, A_{k,i})^{(m)} \ast c_i\Big)$$

where $c_0, \ldots, c_j$ are weights whose sum is 1 and which reflect OPINE's confidence in each type of feature.
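
The weighted sigmoid combination described above can be sketched in a few lines of Python. This is a minimal illustration, not OPINE's implementation: the function names `sigmoid` and `combine_features`, the three feature values, and the three weights are all hypothetical and chosen only to make the arithmetic visible.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_features(feature_values, weights):
    """Estimate P(l(w)=L | A_k) as sigma(sum_i f_i * c_i).

    feature_values: one f_i(w, L, A_{k,i}) value per relationship type.
    weights: confidence weights c_i, which must sum to 1.
    """
    if len(feature_values) != len(weights):
        raise ValueError("need exactly one weight per feature")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    weighted_sum = sum(f * c for f, c in zip(feature_values, weights))
    return sigmoid(weighted_sum)


# Hypothetical feature values for three relationship types
# (e.g. synonymy, antonymy, conjunction) and made-up weights.
features = [0.30, 0.05, 0.60]
weights = [0.5, 0.2, 0.3]
estimate = combine_features(features, weights)
```

Here the weighted sum is 0.5 * 0.30 + 0.2 * 0.05 + 0.3 * 0.60 = 0.34, and `estimate` is the sigmoid of that value.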

Given word $w$, label $L$, relationship type $T$ and neighborhood label assignment $A_k$, let $N_T$ represent the subset of $w$'s neighbors connected to $w$ through a type $T$ relationship. The feature $f_T$ computes the probability that $w$'s label is $L$ given the labels assigned by $A_k$ to words in $N_T$. Using Bayes's Law and assuming that these labels are independent given $l(w)$, we have the following formula for $f_T$ at iteration $m$:

$$f_T(w, L, A_{k,T})^{(m)} = P(l(w)=L)^{(m)} \ast \prod_{j=1}^{|N_T|} P(L_j \mid l(w)=L)$$

$P(L_j \mid l(w)=L)$ is the probability that word $w_j$ has label $L_j$ if $w_j$ and $w$ are linked by a relationship of type $T$ and $w$ has label $L$. We make the simplifying assumption that this probability is constant and depends only on $T$, $L$ and $L_j$, not on the particular words $w_j$ and $w$. For each tuple $(T, L, L_j)$, $L, L_j \in \{pos, neg, neutral\}$, OPINE builds a probability table using a small set of bootstrapped positive, negative and neutral words.
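
As a rough sketch of how a neighborhood feature of this kind could be evaluated, the Python below multiplies the current estimate of $P(l(w)=L)$ by one table entry per neighbor. Everything concrete here is an assumption made for illustration: the table values, the prior, the neighbor labels, and the names `SYNONYM_TABLE` and `neighborhood_feature` are hypothetical and are not taken from OPINE.

```python
from math import prod

LABELS = ("pos", "neg", "neutral")

# Hypothetical probability table for a single relationship type T
# (say, synonymy): SYNONYM_TABLE[L][L_j] = P(L_j | l(w) = L).
# The numbers are invented; each row sums to 1.
SYNONYM_TABLE = {
    "pos":     {"pos": 0.80, "neg": 0.05, "neutral": 0.15},
    "neg":     {"pos": 0.05, "neg": 0.80, "neutral": 0.15},
    "neutral": {"pos": 0.20, "neg": 0.20, "neutral": 0.60},
}


def neighborhood_feature(prior, label, neighbor_labels, table):
    """f_T(w, L, A_{k,T}) = P(l(w)=L) * prod_j P(L_j | l(w)=L).

    prior: current estimate of P(l(w)=L) at iteration m.
    neighbor_labels: labels assigned to w's type-T neighbors (N_T).
    table: conditional probabilities indexed as table[L][L_j].
    """
    return prior * prod(table[label][lj] for lj in neighbor_labels)


# Made-up current estimates for w and labels for its three neighbors.
prior = {"pos": 0.4, "neg": 0.3, "neutral": 0.3}
neighbor_labels = ["pos", "pos", "neutral"]

scores = {
    L: neighborhood_feature(prior[L], L, neighbor_labels, SYNONYM_TABLE)
    for L in LABELS
}
best_label = max(scores, key=scores.get)
```

Because the table depends only on $(T, L, L_j)$, the same lookup serves every word pair linked by that relationship type, which is what the simplifying assumption above buys.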

Finding (Word, Feature) SO Labels

This subtask is motivated by the existence of frequent words which change their SO label based on associated features, but whose SO labels in the context
