tag.setValue(words.get(key));
tags.add(tag);
}
Collections.sort(tags, Collections.reverseOrder());
// Step 3: Return the first K words
...
return cloudwords;
}
Source: Chapter5/text/ExtractTopKeywords.java
To prevent information overload, we generally choose the top K words to create
a word cloud using the method DrawWordCloud summarized in Listing 5.11. An
example of a word cloud containing the 60 most frequent words from the sample
dataset is presented in Fig. 5.11. The word cloud effectively highlights the key
events of the day: the mass arrests of Occupy Wall Street protesters by the NYPD
in Zuccotti Park.
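The top-K selection described above can be sketched in isolation as follows. This is a minimal illustration, not the code of Listing 5.11: the class name TopKWordsSketch, the method topK, and the sample counts in main are all hypothetical, and the sketch assumes the word frequencies have already been computed into a map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the book's code base.
class TopKWordsSketch {

    // Returns the k most frequent words with their counts, most frequent first.
    // Ties are broken alphabetically so that the result is deterministic.
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Clamp k so that a negative or oversized value does not throw.
        int limit = Math.min(Math.max(k, 0), entries.size());
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        // Made-up counts, for illustration only.
        Map<String, Integer> counts = new HashMap<>();
        counts.put("nypd", 7);
        counts.put("park", 5);
        counts.put("zuccotti", 5);
        counts.put("arrests", 3);

        for (Map.Entry<String, Integer> e : topK(counts, 3)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Sorting the full list costs O(n log n); for very large vocabularies a bounded priority queue of size K would avoid sorting words that are never displayed, but for a word cloud of a few dozen words the simpler sort is usually sufficient.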
5.4.2 Adding Context to Word Clouds
Word clouds are effective in summarizing text. However, they leave the reader
responsible for understanding the context in which the words are used. This is
often not straightforward because a word cloud carries limited information.
For example, if two words are used with similar frequency, they are highlighted
equally in the visualization, but a reader cannot determine whether the words
were used together or separately. This problem can be alleviated by using
another dimension to add context to word clouds. Here we show how to use the
time of usage of words to create a visualization with more context. To demonstrate
this idea, we pick the top keywords observed in the word cloud in Fig. 5.11 and
organize them into five broad topics as follows:
Fig. 5.11 An example of a word cloud containing the top 60 words