"(?!')" is a negative lookahead that keeps the rest of the regular expression from matching apostrophes (such as in a contraction, like "you'll") and "\\p{P}" matches any punctuation character. For any match, the call to replaceAll removes the punctuation by replacing it with an empty String. The result of line 21 is an intermediate Stream<String> containing the lines without punctuation.
• Line 22 uses Stream method flatMap to break each line of text into its separate words. Method flatMap receives a Function that maps an object into a stream of elements. In this case, the object is a String containing words and the result is another intermediate Stream<String> for the individual words. The lambda in line 22 passes the String representing a line of text to Pattern method splitAsStream (new in Java SE 8), which uses the regular expression specified in the Pattern (line 16) to tokenize the String into its individual words.
• Lines 23-24 use Stream method collect to count the frequency of each word and place the words and their counts into a TreeMap<String, Long>. Here, we use a version of Collectors method groupingBy that receives three arguments: a classifier, a Map factory and a downstream Collector. The classifier is a Function that returns objects for use as keys in the resulting Map; here, the method reference String::toLowerCase converts each word in the Stream<String> to lowercase. The Map factory is an object that implements interface Supplier and returns a new Map collection; here, the constructor reference TreeMap::new returns a TreeMap that maintains its keys in sorted order. Collectors.counting() is the downstream Collector that determines the number of occurrences of each key in the stream.
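The steps described above can be sketched as one small pipeline. This is a minimal illustration rather than the book's listing: it assumes the lines are already in a List (not read from a file), assumes a whitespace Pattern of "\\s+", and adds a filter for empty tokens that the text does not mention. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class WordCountSketch {
    // Assumption: words are separated by one or more whitespace characters.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
            // Remove punctuation; the lookahead (?!') leaves apostrophes alone.
            .map(line -> line.replaceAll("(?!')\\p{P}", ""))
            // Replace each line with a stream of its words.
            .flatMap(line -> WHITESPACE.splitAsStream(line))
            // Not in the text: drop empty tokens (e.g., from leading whitespace).
            .filter(word -> !word.isEmpty())
            .collect(Collectors.groupingBy(
                String::toLowerCase,      // classifier: lowercase word is the key
                TreeMap::new,             // Map factory: keys kept in sorted order
                Collectors.counting()));  // downstream: occurrences per key
    }

    public static void main(String[] args) {
        List<String> lines =
            Arrays.asList("You'll see: the cat, the hat!", "The end.");
        // prints {cat=1, end=1, hat=1, see=1, the=3, you'll=1}
        System.out.println(countWords(lines));
    }
}
```

Note that "The" and "the" land under the same key because the classifier lowercases each word before grouping, and that the contraction keeps its apostrophe.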
Displaying the Summary Grouped by Starting Letter
Next, lines 27-37 group the key-value pairs in the Map wordCounts by the keys' first letter. This produces a new Map in which each key is a Character and the corresponding value is a List of the key-value pairs in wordCounts in which the key starts with the Character. The statement performs the following tasks:
• First we need to get a Stream for processing the key-value pairs in wordCounts. Interface Map does not contain any methods that return Streams. So, line 27 calls Map method entrySet on wordCounts to get a Set of Map.Entry objects that each contain one key-value pair from wordCounts. This produces an object of type Set<Map.Entry<String, Long>>.
• Line 28 calls Set method stream to get a Stream<Map.Entry<String, Long>>.
• Lines 29-31 call Stream method collect, passing the Collector returned by Collectors method groupingBy with three arguments: a classifier, a Map factory and a downstream Collector. The classifier Function in this case gets the key from the Map.Entry, then uses String method charAt to get the key's first character; this becomes a Character key in the resulting Map. Once again, we use the constructor reference TreeMap::new as the Map factory to create a TreeMap that maintains its keys in sorted order. The downstream Collector (Collectors.toList()) places the Map.Entry objects into a List collection. The result of collect is a Map<Character, List<Map.Entry<String, Long>>>.
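The grouping tasks listed above can be sketched as follows. This is an illustrative outline under stated assumptions, not the book's listing: the class name, method name and sample data are hypothetical, and every key is assumed to be non-empty, since charAt(0) throws an exception on an empty String.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupByFirstLetterSketch {
    static Map<Character, List<Map.Entry<String, Long>>> groupByFirstLetter(
            Map<String, Long> wordCounts) {
        return wordCounts.entrySet()  // Set<Map.Entry<String, Long>>
            .stream()                 // Stream<Map.Entry<String, Long>>
            .collect(Collectors.groupingBy(
                entry -> entry.getKey().charAt(0), // classifier: key's first character
                TreeMap::new,                      // Map factory: sorted Character keys
                Collectors.toList()));             // downstream: entries into a List
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for wordCounts.
        Map<String, Long> wordCounts = new TreeMap<>();
        wordCounts.put("cat", 1L);
        wordCounts.put("cup", 2L);
        wordCounts.put("the", 3L);

        // prints {c=[cat=1, cup=2], t=[the=3]}
        System.out.println(groupByFirstLetter(wordCounts));
    }
}
```

Because the source Map here is a TreeMap, its entries reach the stream in key order, so each List also ends up alphabetized within its letter.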