Words are combined into phrases and sentences. Sentence detection can be problematic
and is not as simple as looking for the periods at the end of a sentence. Periods are found
in many places including abbreviations such as Ms. and in numbers such as 12.834.
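To make the problem concrete, the following minimal sketch contrasts splitting on every period with the JDK's rule-based `java.text.BreakIterator`. The class name, method names, and sample text are assumptions made for illustration; they do not come from any particular NLP library.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SentenceSplitDemo {

    // Naive approach: treat every period as the end of a sentence.
    static List<String> naiveSplit(String text) {
        return Arrays.asList(text.split("\\."));
    }

    // Rule-based approach from the JDK. The boundaries partition the text,
    // so joining the returned pieces reproduces the input exactly.
    static List<String> breakIteratorSplit(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
        iterator.setText(text);
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            sentences.add(text.substring(start, end));
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Illustrative sample containing an abbreviation and a decimal number.
        String text = "Ms. Smith paid 12.834 dollars. She left.";
        System.out.println(naiveSplit(text));
        System.out.println(breakIteratorSplit(text));
    }
}
```

The naive version cuts the sample into four fragments, breaking both the abbreviation and the number. The rule-based iterator is not guaranteed to get every abbreviation right either, which is why dedicated sentence detectors are usually trained on annotated text.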
We often need to understand which words in a sentence are nouns and which are verbs.
We are sometimes concerned with the relationship between words. For example, coreference
resolution determines the relationship between certain words in one or more sentences.
Consider the following sentences:
"The city is large but beautiful. It fills the entire valley."
The word "It" refers back to "city"; the two words are coreferent. When a word has multiple
meanings, we might need to perform Word Sense Disambiguation to determine the meaning that
was intended. This can be difficult to do at times. For example, consider "John went back home."
Does "home" refer to a house, a city, or some other unit? Its meaning can sometimes be
inferred from the context in which it is used. For example, in "John went back home. It was
situated at the end of a cul-de-sac.", the mention of a cul-de-sac suggests that "home" is a house.
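One simple way to use context is to count how many cue words for each candidate sense appear near the ambiguous word. The sketch below applies that overlap idea to "home". The cue-word lists, the class name, and the method name are hypothetical and hand-written for this example; a real system would draw on a dictionary or a trained model.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class HomeSenseDemo {

    // Hypothetical, hand-written cue words for two senses of "home".
    static final Map<String, Set<String>> SENSE_CUES = new LinkedHashMap<>();
    static {
        SENSE_CUES.put("house", new HashSet<>(Arrays.asList("cul-de-sac", "street", "garden", "roof")));
        SENSE_CUES.put("city", new HashSet<>(Arrays.asList("mayor", "downtown", "population", "suburbs")));
    }

    // Returns the sense whose cue words overlap most with the context,
    // or "unknown" when no cue word appears at all.
    static String guessSense(String context) {
        // Lowercase and split on anything that is not a letter or hyphen.
        String[] tokens = context.toLowerCase(Locale.ROOT).split("[^a-z-]+");
        Set<String> words = new HashSet<>(Arrays.asList(tokens));
        String best = "unknown";
        int bestOverlap = 0;
        for (Map.Entry<String, Set<String>> sense : SENSE_CUES.entrySet()) {
            Set<String> shared = new HashSet<>(sense.getValue());
            shared.retainAll(words);
            if (shared.size() > bestOverlap) {
                bestOverlap = shared.size();
                best = sense.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(guessSense("John went back home."));
        System.out.println(guessSense("John went back home. It was situated at the end of a cul-de-sac."));
    }
}
```

With only the first sentence there is no cue word, so the sketch reports "unknown"; adding the second sentence tips it toward "house". The approach is only as good as its cue lists, which is one reason disambiguation remains difficult.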
Note
In spite of these difficulties, NLP is able to perform these tasks reasonably well in most
situations and provide added value to many problem domains. For example, sentiment
analysis can be performed on customer tweets, resulting in possible free product offers for
dissatisfied customers. Medical documents can be readily summarized to highlight the
relevant topics and improve productivity.
Summarization is the process of producing a short description of a unit of text. These
units can include multiple sentences, paragraphs, a document, or multiple documents. The
intent may be to identify those sentences that convey the meaning of the unit, to determine
the prerequisites for understanding a unit, or to find items within these units. Frequently,
the context of the text is important in accomplishing this task.
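As a rough illustration of identifying the sentence that best conveys the meaning of a unit, the sketch below scores each sentence by how often its words occur in the whole text and returns the highest-scoring one. This is a simple frequency heuristic, not a method described in the text above. The stop-word list, the naive punctuation-based sentence split, and all names are assumptions chosen to keep the example short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class FrequencySummaryDemo {

    // Tiny illustrative stop-word list; real lists are much longer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "is", "a", "of", "and", "it"));

    // Lowercases the input and splits on runs of non-letters.
    static String[] words(String s) {
        return s.toLowerCase(Locale.ROOT).split("[^a-z]+");
    }

    // Returns the sentence whose words are most frequent across the text.
    static String topSentence(String text) {
        // Count every non-empty word that is not a stop word.
        Map<String, Integer> frequency = new HashMap<>();
        for (String word : words(text)) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                frequency.merge(word, 1, Integer::sum);
            }
        }
        String best = "";
        int bestScore = -1;
        // Naive split: break after ., ! or ? when followed by whitespace.
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            int score = 0;
            for (String word : words(sentence)) {
                score += frequency.getOrDefault(word, 0); // stop words score 0
            }
            if (score > bestScore) {
                bestScore = score;
                best = sentence;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String text = "The city is large. The city fills the valley. Rain fell.";
        System.out.println(topSentence(text));
    }
}
```

Because scores are summed rather than averaged, longer sentences are favored, and the heuristic ignores context entirely, so it should be read as a starting point rather than a usable summarizer.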