their antecedents [2]. For instance, when performing pronoun coreferencing, syntactic agreement based on person, gender, and number limits our search for a noun phrase linked to a pronoun to a few candidates in the text. In addition, consistency restrictions limit our search to a precise text span (the previous sentence, the preceding text in the current sentence, or the previous and current sentence), depending on whether the pronoun is personal, possessive, or reflexive, and on its person.
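As a sketch of how such restrictions can be applied (the pronoun classes, agreement fields, and span rules below are our simplification for illustration, not the system's actual implementation), consider a candidate filter along these lines:

```python
from dataclasses import dataclass

@dataclass
class Mention:
    text: str
    person: int          # 1, 2, or 3
    gender: str          # "m", "f", or "n" (underspecified)
    number: str          # "sg" or "pl"
    sentence_index: int  # index of the containing sentence

def candidate_antecedents(pronoun, pronoun_type, mentions, current):
    """Keep only noun phrases that agree with the pronoun and fall
    inside the text span allowed for its type (simplified rules)."""
    # Span restriction by pronoun type: a reflexive resolves within
    # the current sentence; personal and possessive pronouns may also
    # look one sentence back.
    if pronoun_type == "reflexive":
        allowed = {current}
    else:
        allowed = {current - 1, current}

    return [
        m for m in mentions
        if m.sentence_index in allowed
        and m.person == pronoun.person
        and m.number == pronoun.number
        # treat "n" as underspecified gender that matches anything
        and (m.gender == pronoun.gender or "n" in (m.gender, pronoun.gender))
    ]
```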
In the sentence “John works by himself,” “himself” must refer to John, whereas in “John bought him a new car,” “him” must refer to some other individual mentioned in a previous sentence. In the sentence “‘You have not been sending money,’ John said in a recent call to his wife from Germany,” binding theory constraints restrict pronoun resolution within a quotation to first- and second-person pronouns (e.g., “you”), and restrict the candidate antecedent to a noun outside the quotation that fills the grammatical role of object of a verb or argument of a preposition (e.g., “wife”). Our coreferencing and anaphora resolution models also benefit from preferential weighting based on dependency attributes, as sketched below. Candidate antecedents that appear closer to the pronoun in the text are scored higher (weighting by referential distance). Subject is favored over object, except for accusative pronouns (weighting by syntactic position). A head noun is favored over its modifiers (weighting by head label).
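The combined effect of these preferences can be sketched as a scoring function; the numeric weights, and attribute names such as position, role, and is_head, are illustrative placeholders rather than the values and fields used by the system:

```python
def score_candidate(candidate, pronoun, pronoun_is_accusative=False):
    """Score a candidate antecedent by the three preferential
    weightings: referential distance, syntactic position, head label."""
    score = 0.0

    # Referential distance: candidates closer to the pronoun score higher.
    distance = max(pronoun.position - candidate.position, 1)
    score += 1.0 / distance

    # Syntactic position: favor subjects over objects, except when
    # resolving an accusative pronoun.
    if candidate.role == "subject" and not pronoun_is_accusative:
        score += 0.5
    elif candidate.role == "object" and pronoun_is_accusative:
        score += 0.5

    # Head label: favor a head noun over its modifiers.
    if candidate.is_head:
        score += 0.25

    return score

# The best-scoring candidate that survives the consistency filter wins:
# best = max(candidates, key=lambda c: score_candidate(c, pronoun))
```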
In addition, as part of the normalization process, we apply a transformational grammar to map multiple surface structures into an equivalent deep structure. A common example is the normalization of a dependency structure involving a passive verb form into the active voice, together with recognition of the deep subject of such a clause. At the more pragmatic level, we apply rules to normalize composite verb expressions, capture explicit and implicit negations, or verbalize nouns or adjectives in cases where they convey the action sense in preference to the governing verb of a clause. For instance, the sentences “Bill did not visit Jane,” which contains an explicit negation, and “Bill failed to visit Jane,” where the negation is rendered by a composite verb expression, are mapped to the same structure.
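This negation normalization can be sketched as follows; the flat clause representation and the list of implicitly negating verbs are simplified assumptions, not the actual transformational rules:

```python
# Composite verb expressions that convey negation of their complement.
IMPLICIT_NEGATORS = {"fail", "neglect", "refuse", "decline"}

def normalize_negation(clause):
    """Map explicit ('did not visit') and implicit ('failed to visit')
    negations to one deep structure: a negated main predicate.
    `clause` is a simplified dict, not the full augmented tree."""
    if clause.get("negated"):                        # explicit "not"
        return {"pred": clause["verb"], "neg": True,
                "subj": clause["subj"], "obj": clause.get("obj")}
    if clause["verb"] in IMPLICIT_NEGATORS and "comp_verb" in clause:
        # Verbalize the complement and carry the negation over.
        return {"pred": clause["comp_verb"], "neg": True,
                "subj": clause["subj"], "obj": clause.get("obj")}
    return {"pred": clause["verb"], "neg": False,
            "subj": clause["subj"], "obj": clause.get("obj")}

# Both surface forms map to the same deep structure:
a = normalize_negation({"verb": "visit", "negated": True,
                        "subj": "Bill", "obj": "Jane"})
b = normalize_negation({"verb": "fail", "comp_verb": "visit",
                        "subj": "Bill", "obj": "Jane"})
assert a == b == {"pred": "visit", "neg": True, "subj": "Bill", "obj": "Jane"}
```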
5.2.2 Storage
The output of a deep parser is a complex augmented tree structure that usually does
not lend itself to a tractable indexing schema for cross-document search. Therefore,
we have developed a set of rules for converting an augmented tree representation
into a scalable data storage structure.
In a dependency tree, every word in the sentence is a modifier of exactly one
other word (called its head), except the head word of the sentence, which does not
have a head. We use a list of tuples to specify a dependency tree with the following
format:
(Label Modifier Root POS Head-label Role Antecedent [Attributes])
where: Label is a unique numeric ID; Modifier is a term in the sentence; Root is the root form (or category) of the modifier; POS is its lexical category; Head-label is the ID of the term that the modifier modifies; Role specifies the type of dependency relationship between the head and the modifier, such as subject, complement, etc.; Antecedent is the antecedent of the modifier; and Attributes is the list of semantic attributes that may be associated with the modifier, e.g., person's name, location, time, number, date, etc.
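To make the format concrete, here is a hypothetical encoding of the earlier example “John works by himself” as a list of such tuples; the label numbering, POS tags, and role names are illustrative, not the system's actual inventory:

```python
from collections import namedtuple

DepTuple = namedtuple(
    "DepTuple",
    ["label", "modifier", "root", "pos",
     "head_label", "role", "antecedent", "attributes"])

# "John works by himself" -- the head word of the sentence ("works")
# has no head; we mark that here with head_label=0. The reflexive
# "himself" carries the label of its resolved antecedent ("John").
sentence = [
    DepTuple(1, "John",    "John",    "N",    2, "subject",  None, ["person-name"]),
    DepTuple(2, "works",   "work",    "V",    0, "head",     None, []),
    DepTuple(3, "by",      "by",      "Prep", 2, "modifier", None, []),
    DepTuple(4, "himself", "himself", "N",    3, "pcomp",    1,    []),
]

# A flat list of tuples like this can be bulk-loaded into a relational
# table or an inverted index, which is what makes cross-document search
# tractable; e.g., find the subject of the sentence head:
head = next(t for t in sentence if t.head_label == 0)
subject = next(t.modifier for t in sentence
               if t.role == "subject" and t.head_label == head.label)
assert subject == "John"
```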