[^.!?\s][^.!?]*(?:[.!?](?!['"]?\s|$)[^.!?]*)*[.!?]?['"]?(?=\s|$)
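The comments embedded in the following pattern explain what each part represents.
Because the pattern is compiled with Pattern.COMMENTS, whitespace and any text from a #
to the end of the line are ignored by the regex engine: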
Pattern sentencePattern = Pattern.compile(
    "# Match a sentence ending in punctuation or EOS.\n"
    + "[^.!?\\s] # First char is non-punct, non-ws\n"
    + "[^.!?]* # Greedily consume up to punctuation.\n"
    + "(?: # Group for unrolling the loop.\n"
    + " [.!?] # (special) inner punctuation ok if\n"
    + " (?!['\"]?\\s|$) # not followed by ws or EOS.\n"
    + " [^.!?]* # Greedily consume up to punctuation.\n"
    + ")* # Zero or more (special normal*)\n"
    + "[.!?]? # Optional ending punctuation.\n"
    + "['\"]? # Optional closing quote.\n"
    + "(?=\\s|$) # Must be followed by ws or EOS.",
    Pattern.MULTILINE | Pattern.COMMENTS);
Another representation of this expression can be generated using the display tool found at
http://regexper.com/ . The diagram it produces depicts the expression graphically and can
clarify how it works.
The pattern's matcher method is called with the sample paragraph, and each sentence that
the matcher finds is then displayed:
Matcher matcher = sentencePattern.matcher(paragraph);
while (matcher.find()) {
    System.out.println(matcher.group());
}
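The fragments above can be put together into a small, self-contained program. The following is only a minimal sketch: the class name SentenceSplitter, the helper method sentences, and the sample text are all assumptions made for illustration, since the paragraph variable used above is defined elsewhere. It uses the compact one-line form of the expression, so the Pattern.COMMENTS flag is not needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is not taken from the original text.
class SentenceSplitter {

    // Compact form of the sentence expression shown earlier (no embedded comments).
    private static final Pattern SENTENCE_PATTERN = Pattern.compile(
        "[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$)[^.!?]*)*[.!?]?['\"]?(?=\\s|$)",
        Pattern.MULTILINE);

    // Collects every match of the pattern, in order of appearance.
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = SENTENCE_PATTERN.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample text, chosen so that one period sits inside a number.
        String paragraph = "The price is 3.50 today. Is it fair? Yes!";
        for (String sentence : sentences(paragraph)) {
            System.out.println(sentence);
        }
    }
}
```

The period in 3.50 is not treated as the end of a sentence because it is not followed by whitespace or the end of the input, which is the case the negative lookahead inside the group is there to handle.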