DEPLOYMENT
Gillian wanted to investigate the similarities and differences between several of the
Federalist Papers in order to lend credence to the belief that Alexander Hamilton and James
Madison collaborated on paper 18.
Figure 12-27. Final cluster results after training our text mining model to
recognize John Jay's writing style.
Gillian now has the evidence she had hoped to find. As we continued to train our model on John
Jay's writing style, we found that he was indeed consistent across papers 3, 4, and 5:
RapidMiner found these documents to be the most similar and clustered them
together in cluster_1. At the same time, RapidMiner associated paper 18, the suspected
collaboration between Hamilton and Madison, first with one author, then with the other, and finally
with both of them together. Gillian could strengthen her model further by adding more papers
from all three authors, or she could go ahead and add what we've already found to her exhibit at
the museum.
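The idea of grouping documents by how similar their wording is can be illustrated outside RapidMiner. The following is a minimal sketch, not RapidMiner's implementation: it assumes plain word counts and cosine similarity as the similarity measure, and the function names (tokenize, term_vector, cosine_similarity, most_similar) and the sample strings are hypothetical placeholders, not text from the Federalist Papers.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    # Only tokens present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(target, candidates):
    """Return the name of the candidate text closest to the target."""
    target_vec = term_vector(target)
    return max(
        candidates,
        key=lambda name: cosine_similarity(target_vec, term_vector(candidates[name])),
    )


# Made-up sample documents, used only to exercise the functions.
samples = {
    "doc_a": "the union must guard against foreign danger",
    "doc_b": "commerce and taxes shape the state budgets",
}
query = "foreign danger threatens the union"
closest = most_similar(query, samples)
```

Here closest names the sample whose vocabulary overlaps most with the query. A clustering operator goes further by grouping many documents at once, but it relies on the same kind of pairwise comparison of word vectors.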
CHAPTER SUMMARY
Text mining is a powerful way of analyzing data in an unstructured format, such as paragraphs of
text. Text can be fed into a model in different ways and then broken down into
tokens. Once tokenized, words can be further manipulated to address matters such as case
sensitivity, phrases or word groupings, and word stems. The results of these analyses can reveal
 
 