Finding coreference resolution entities
Coreference is the occurrence of two or more expressions in text that refer to the same
person or entity; coreference resolution is the task of identifying these expressions.
Consider the following sentence:
"He took his cash and she took her change and together they bought their lunch."
There are several coreferences in this sentence. The word "his" refers to "He" and the word
"her" refers to "she". In addition, "they" refers to both "He" and "she".
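One way to picture these coreferences is to group the mentions by the entity they refer to. The following is a minimal sketch that uses hand-assigned labels for the example sentence; it is not the output of any tool, and the class name CorefChainsDemo, the method buildChains, and the entity labels are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of the Stanford API.
class CorefChainsDemo {
    // Groups {mention, entity} pairs into one chain per entity,
    // preserving the order in which mentions appear in the text.
    static Map<String, List<String>> buildChains(String[][] mentionToEntity) {
        Map<String, List<String>> chains = new LinkedHashMap<>();
        for (String[] pair : mentionToEntity) {
            chains.computeIfAbsent(pair[1], k -> new ArrayList<>()).add(pair[0]);
        }
        return chains;
    }

    public static void main(String[] args) {
        // Labels assigned by hand for the example sentence (an assumption).
        String[][] annotated = {
            {"He", "person1"}, {"his", "person1"},
            {"she", "person2"}, {"her", "person2"},
            {"they", "person1+person2"}, {"their", "person1+person2"}
        };
        buildChains(annotated).forEach((entity, mentions) ->
                System.out.println(entity + " -> " + mentions));
    }
}
```

Each list produced this way plays the role of a coreference chain: every mention in it points to the same person or group.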
An endophora is a reference to an expression that either precedes or follows it in the
text. Endophora can be classified as anaphors or cataphors. In the following sentence,
the word "It" is the anaphor that refers to its antecedent, "the earthquake":
"Mary felt the earthquake. It shook the entire building."
In the next sentence, "she" is a cataphor as it points to the postcedent, "Mary":
"As she sat there, Mary felt the earthquake."
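The distinction between the two comes down to the relative position of the referring word and the expression it refers to. The following is a minimal sketch of that rule, assuming token positions are already known; the class EndophoraDemo and its classify method are hypothetical helpers and are not part of the Stanford API.

```java
import java.util.List;

// Hypothetical helper, not part of the Stanford API: classifies a referring
// expression by where its referent sits in the token sequence.
class EndophoraDemo {
    enum Kind { ANAPHOR, CATAPHOR }

    // mentionIndex: token position of the referring word (e.g. "It", "she").
    // referentIndex: token position of the expression it refers to.
    static Kind classify(int mentionIndex, int referentIndex) {
        if (mentionIndex == referentIndex) {
            throw new IllegalArgumentException("A mention cannot refer to itself");
        }
        // A referent that comes earlier is an antecedent (anaphor);
        // one that comes later is a postcedent (cataphor).
        return referentIndex < mentionIndex ? Kind.ANAPHOR : Kind.CATAPHOR;
    }

    public static void main(String[] args) {
        // "Mary felt the earthquake. It shook the entire building."
        List<String> first = List.of("Mary", "felt", "the", "earthquake", ".",
                "It", "shook", "the", "entire", "building", ".");
        System.out.println(classify(first.indexOf("It"), first.indexOf("earthquake")));

        // "As she sat there, Mary felt the earthquake."
        List<String> second = List.of("As", "she", "sat", "there", ",",
                "Mary", "felt", "the", "earthquake", ".");
        System.out.println(classify(second.indexOf("she"), second.indexOf("Mary")));
    }
}
```

The hard part of coreference resolution is deciding which expression a word refers to in the first place; this sketch only labels the direction once that link is known.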
The Stanford API supports coreference resolution with the StanfordCoreNLP class using
the dcoref annotator. We will demonstrate the use of this class with the first example
sentence of this section.
We start with the creation of the pipeline and the use of the annotate method, as shown
here:
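// Imports the snippet relies on
import java.util.Properties;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

String sentence = "He took his cash and she took her change "
    + "and together they bought their lunch.";
// dcoref depends on the annotators listed before it
Properties props = new Properties();
props.setProperty("annotators",
    "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// Wrap the text and run every annotator in the pipeline over it
Annotation annotation = new Annotation(sentence);
pipeline.annotate(annotation);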