Chapter 4. Finding People and Things
The process of finding people and things is referred to as Named Entity Recognition
(NER). Entities such as people and places are associated with named categories that
identify what they are. A named category can be as simple as "people". Common
entity types include:
• People
• Locations
• Organizations
• Money
• Time
• URLs
Finding names, locations, and other things in a document is an important and useful NLP
task. NER is used in many places, such as conducting simple searches, processing queries,
resolving references, disambiguating text, and finding the meaning of text. For example,
a search is sometimes interested only in entities that belong to a single category; using
categories, the search can be restricted to those item types. Other NLP tasks, such as
POS tagging and cross-referencing, also make use of NER.
The NER process involves two tasks:
• Detection of entities
• Classification of entities
Detection is concerned with finding the position of an entity within text. Once an entity
is located, it is important to determine what type of entity was discovered. After these
two tasks have been performed, the results can be used to solve other tasks such as
searching and determining the meaning of the text. For example, names identified in a
movie or book review can help find other movies or books that might be of interest.
Extracting location information can assist in providing references to nearby services.
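The two tasks can be illustrated with a minimal sketch that uses regular expressions. It is not a production approach: the class name SimpleNer, the Entity type, and the three patterns are assumptions made for this example, and they cover only a few easily recognized categories (URLs, money, time). Detection corresponds to the match position (start and end offsets); classification corresponds to the category whose pattern matched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not part of any NLP library.
public class SimpleNer {

    /** A detected entity: its category plus its character span in the text. */
    public static final class Entity {
        public final String type;  // classification result, e.g. "MONEY"
        public final String text;  // the matched text
        public final int start;    // detection result: start offset (inclusive)
        public final int end;      // detection result: end offset (exclusive)

        Entity(String type, String text, int start, int end) {
            this.type = type;
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Illustrative patterns only; real text needs far more robust rules or a trained model.
    private static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("URL", Pattern.compile("https?://\\S+"));
        PATTERNS.put("MONEY", Pattern.compile("\\$\\d+(?:\\.\\d{2})?"));
        PATTERNS.put("TIME", Pattern.compile("\\b\\d{1,2}:\\d{2}\\b"));
    }

    /** Scans the text once per category and records where each match was found. */
    public static List<Entity> findEntities(String text) {
        List<Entity> found = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : PATTERNS.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                found.add(new Entity(entry.getKey(), matcher.group(),
                        matcher.start(), matcher.end()));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "Pay $12.50 by 7:30 at http://example.com today";
        for (Entity e : findEntities(text)) {
            System.out.println(e.type + " [" + e.start + "," + e.end + "): " + e.text);
        }
    }
}
```

Results are grouped by category in the order the patterns were registered, not by position in the text. Categories such as people, locations, and organizations generally cannot be captured well by patterns like these, and typically call for dictionaries or trained models instead.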