Figure 12-17. Tokens generated from Federalist Papers 5, 14, 17 and 18, with frequencies.
The model generates many tokens, which is not surprising given the length of each essay we fed
into it. Figure 12-17 shows that some of our tokens appear in multiple documents. Consider the
word (or token) 'acquainted'. Its Total Occurrences value is 3, and its Document Occurrences
value is also 3. Because a token must occur at least once in every document that contains it,
three occurrences spread across three documents means that 'acquainted' appears exactly once in
each of three of the four documents.
(Note that even a cursory review of these tokens reveals some stemming opportunities, for
example 'accomplish' and 'accomplished', or 'according' and 'accordingly'.) Click the Total
Occurrences column header twice to sort in descending order, which brings the most common terms
to the top.
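
To make the two columns concrete, the following is a minimal Python sketch of how Total Occurrences and Document Occurrences could be computed and sorted. It is an illustration only, not the tool's own implementation: the function names `tokenize` and `token_table` are hypothetical, the tokenizer is a simple assumption (lowercase alphabetic runs), and the four short sample documents are invented stand-ins rather than text from the Federalist Papers.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def token_table(documents):
    """Return (token, total_occurrences, document_occurrences) rows.

    Rows are sorted by total occurrences, highest first, with ties
    broken alphabetically.
    """
    total = Counter()      # every appearance of a token, across all documents
    doc_count = Counter()  # number of documents containing the token
    for text in documents:
        tokens = tokenize(text)
        total.update(tokens)
        doc_count.update(set(tokens))  # count each token once per document
    rows = [(tok, total[tok], doc_count[tok]) for tok in total]
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


# Invented sample documents, standing in for the four essays.
docs = [
    "The people are acquainted with the plan.",
    "We are acquainted with the danger.",
    "Few were acquainted with it.",
    "The union must accomplish the plan.",
]

rows = token_table(docs)
for token, total_occ, doc_occ in rows[:5]:
    print(f"{token:<12} {total_occ:>3} {doc_occ:>3}")
```

In this sample, 'acquainted' has a total of 3 and a document count of 3, so it occurs once in each of three documents, while 'the' has a total of 5 across 3 documents, so it must occur more than once in at least one of them. Sorting by the total in descending order mirrors the effect of bringing the most common terms to the top.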