gain knowledge of the concepts in this topic by building a crawler that collects
Twitter data in real time. The reader will then learn how to analyze this data to find
important time periods, users, and topics in their dataset. Finally, the reader will see
how all of these concepts can be brought together to perform visual analysis and
create meaningful software that uses Twitter data.
The code examples in this topic are written in Java® and JavaScript®. Familiarity
with these languages will be useful in understanding the code; however, the
examples should be straightforward enough for anyone with basic programming
experience. This topic does assume that you know the programming concepts
behind a high-level language.
1.2 Learning Through Examples
Every concept discussed in this topic is accompanied by illustrative examples. The
examples in Chap. 4 use an open-source network analysis library, JUNG™,⁵ to
perform network computations. The algorithms provided in this library are often
highly optimized, and we recommend them for the development of production
applications. However, because they are optimized, this code can be difficult to
interpret for someone viewing these topics for the first time. In these cases, we
present code that focuses more on readability than optimization to communicate the
concepts using the examples. To build the visualizations in Chap. 5, we use the data
visualization library D3™.⁶ D3 is a versatile visualization toolkit that supports
various types of visualizations. We recommend that readers browse through the
examples to find other interesting ways to visualize Twitter data.
All of the examples read directly from a text file, where each line is a JSON
document as returned by the Twitter APIs (the format of which is covered in
Chap. 2). These examples can easily be adapted to read from MongoDB®, but
we leave this as an exercise for the reader.
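As a rough illustration of the file layout described above, the following minimal sketch reads a text file in which each non-blank line holds one JSON document. The class name TweetFileReader, the method readDocuments, and the default file name tweets.json are hypothetical and are not taken from the code distributed with the examples. The sketch returns each line as a raw string and does not parse it; a JSON library would be needed to access individual fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the examples' source code.
final class TweetFileReader {

    // Returns the raw JSON text found on each non-blank line of the file.
    // Parsing the JSON itself is left to a JSON library of the reader's choice.
    static List<String> readDocuments(Path file) throws IOException {
        List<String> docs = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {   // skip blank lines between documents
                    docs.add(line);
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        // "tweets.json" is an assumed default file name.
        Path file = Paths.get(args.length > 0 ? args[0] : "tweets.json");
        List<String> docs = readDocuments(file);
        System.out.println("Read " + docs.size() + " JSON documents");
    }
}
```

Reading one document per line keeps memory use predictable, because a large collection can be processed line by line instead of being loaded as a single JSON array.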
Whenever “...” appears in a code example, code has been omitted from the
example. This is done to remove code that is not pertinent to understanding the
concepts. To obtain the full source code used in the examples, refer to the topic's
website, http://tweettracker.fulton.asu.edu/tda.
The dataset used for the examples in this topic comes from the Occupy Wall
Street movement, a protest centered around the wealth disparity in the US. This
movement attracted significant focus on Twitter. We focus on a single day of this
event to give a picture of what these measures look like with the same data. The
dataset has been anonymized to remove any personally identifiable information.
This dataset is also made available on the topic's website for the reader to use when
executing the examples.
5 http://jung.sourceforge.net/
6 http://d3js.org