Finding Sentences - Natural Language Processing with Java

Within sentences we may find numbers like 3.14159, abbreviations such as found in Mr. Smith, and possibly ellipses either within a sentence …, or at the end of a sentence…

The detector handled this paragraph well, catching both the simple sentences and the more complex ones. Of course, the text we process is not always well formed. The following paragraph has extra spaces in some places and is missing spaces where they are needed, a problem that is likely to occur when analyzing chat sessions:

paragraph = " This sentence starts with spaces and ends with "
    + "spaces . This sentence has no spaces between the next "
    + "one.This is the next one.";

When we use this paragraph with the previous example, we get the following output:

This sentence starts with spaces and ends with spaces .
This sentence has no spaces between the next one.This is the next one.

The leading spaces of the first sentence were removed, but the trailing spaces were not. The third sentence was not detected; it was merged with the second sentence.
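One possible way to reduce these problems is to clean the text before passing it to the detector. The following is only a minimal sketch and is not part of the original example: the ParagraphCleaner class, its clean method, and the three regular expressions are hypothetical names and rules chosen for illustration. It uses only the standard java.util.regex package, and the rules are deliberately simple, so they may misfire on text such as file names or abbreviations written without spaces.

```java
import java.util.regex.Pattern;

public class ParagraphCleaner {
    // One or more whitespace characters in a row
    private static final Pattern EXTRA_SPACES = Pattern.compile("\\s+");
    // Whitespace that appears directly before a sentence terminator
    private static final Pattern SPACE_BEFORE_END = Pattern.compile("\\s+([.!?])");
    // A period squeezed between a lowercase letter and an uppercase letter
    private static final Pattern MISSING_SPACE =
        Pattern.compile("(?<=\\p{Ll})\\.(?=\\p{Lu})");

    // Hypothetical helper: normalizes spacing so sentence boundaries are easier to find
    public static String clean(String text) {
        // Drop leading and trailing whitespace, then collapse inner runs to one space
        String result = EXTRA_SPACES.matcher(text.trim()).replaceAll(" ");
        // Remove the gap between the last word of a sentence and its terminator
        result = SPACE_BEFORE_END.matcher(result).replaceAll("$1");
        // Insert the space that is missing after a sentence-ending period
        result = MISSING_SPACE.matcher(result).replaceAll(". ");
        return result;
    }

    public static void main(String[] args) {
        String paragraph = "  This sentence starts with spaces and ends with "
            + "spaces  . This sentence has no spaces between the next "
            + "one.This is the next one.";
        System.out.println(clean(paragraph));
    }
}
```

The cleaned string could then be passed to sentDetect in place of the raw paragraph. Numbers such as 3.14159 and abbreviations such as Mr. Smith are left unchanged by these particular rules, because the period is followed by a digit or a space rather than an uppercase letter.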

The getSentenceProbabilities method returns an array of doubles giving the detector's confidence in each sentence found by the most recent call to the sentDetect method. Add the following code after the for-each statement that displays the sentences:

double[] probabilities = detector.getSentenceProbabilities();
for (double probability : probabilities) {
    System.out.println(probability);
}

Running this code with the original paragraph produces the following output:

0.9841708738988814
0.908052385070974
0.9130082376342675
1.0
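These values can be used to flag sentences that the detector was less sure about. The sketch below is an illustration only: the ConfidenceFilter class, the belowThreshold method, and the 0.95 cutoff are assumptions made for this example, and the probabilities are hard-coded from the output above so that the code runs without the OpenNLP library. In a real program the array would come from getSentenceProbabilities.

```java
import java.util.ArrayList;
import java.util.List;

public class ConfidenceFilter {
    // Hypothetical helper: returns the indexes of all values below the threshold
    public static List<Integer> belowThreshold(double[] probabilities, double threshold) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < probabilities.length; i++) {
            if (probabilities[i] < threshold) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Values copied from the output shown above
        double[] probabilities = {
            0.9841708738988814, 0.908052385070974, 0.9130082376342675, 1.0
        };
        // 0.95 is an arbitrary cutoff chosen for this illustration
        System.out.println(belowThreshold(probabilities, 0.95)); // prints [1, 2]
    }
}
```

Because the array holds one probability per detected sentence, each returned index identifies the corresponding entry in the array returned by sentDetect.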
