Visualizing Twitter Data - Twitter Data Analytics - page 50

interesting by adding context to the network. During an event, a user would want

to analyze the event from different perspectives. For example, in a natural disaster

like a thunderstorm, instead of analyzing all the retweets related to the disaster, a

first responder might be more interested in reports of damage or flooding. Thus,

by filtering the information, we can make the visualization manageable while

enhancing the ability to focus on topics of interest.

To achieve this, we define groups of words called topics. For example, we

create topic 1 with “#zuccotti” (Light) and topic 2 with “#nypd” (Dark). We

can now generate a retweet network consisting of people who retweeted text

matching these topics. Before we visualize the network, we must first extract and

format it. This can be done using the method ConvertTweetsToDiffusionPath in class

CreateD3Network, which is summarized in Listing 5.2.
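The idea of a topic as a group of words can be sketched as follows. This is only an illustration, not the book's code: the class TopicMatcher, its methods, and the use of a Map are hypothetical (the method in Listing 5.2 receives its topics as a JSONObject), and the case-insensitive substring match is an assumption about how "matching" might work.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the book's CreateD3Network class.
final class TopicMatcher {
    // Topic name -> keywords. The names and layout are assumptions for illustration.
    private final Map<String, List<String>> topics = new LinkedHashMap<>();

    void addTopic(String name, List<String> keywords) {
        topics.put(name, keywords);
    }

    // Returns the first topic with a keyword contained in the text
    // (case-insensitive), or null if no topic matches.
    String match(String tweetText) {
        String text = tweetText.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> topic : topics.entrySet()) {
            for (String keyword : topic.getValue()) {
                if (text.contains(keyword.toLowerCase(Locale.ROOT))) {
                    return topic.getKey();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TopicMatcher matcher = new TopicMatcher();
        matcher.addTopic("topic1", Arrays.asList("#zuccotti"));
        matcher.addTopic("topic2", Arrays.asList("#nypd"));
        System.out.println(matcher.match("RT @user: crowds at #Zuccotti park"));
        System.out.println(matcher.match("an unrelated tweet"));
    }
}
```

Tweets for which match returns null would simply be left out of the retweet network, which is what keeps the visualization manageable.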

The extracted network can be visualized using the method create_network, which

is summarized in Listing 5.3. Figure 5.2 shows the visualization of the top five most

frequently retweeted nodes and those who retweeted them on topics 1 and 2. The

size of a node indicates its importance in the network. Larger nodes have been

retweeted more often than smaller nodes. Nodes are colored according to their

topic preference. The links are directed and show the flow of information. Here,

we can identify not only important information producers (large nodes) but also

information consumers (nodes with a large number of inlinks). Additionally,

the network shows that people retweet across topics, which is evident from the

connections between the users.
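The relationship between retweet counts and node importance can be sketched as below. The class RetweetCounts and its methods are hypothetical and are not taken from the book; the sketch assumes each retweet is stored as an {author, retweeter} pair, with information flowing from the author to the retweeter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the book's code.
final class RetweetCounts {
    // Counts how many times each author was retweeted.
    // Each edge is {author, retweeter}: information flows from author to retweeter.
    static Map<String, Integer> countRetweets(List<String[]> edges) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] edge : edges) {
            counts.merge(edge[0], 1, Integer::sum);
        }
        return counts;
    }

    // Returns up to k authors ordered by descending retweet count
    // (ties are broken alphabetically so the result is deterministic).
    static List<String> topAuthors(Map<String, Integer> counts, int k) {
        List<String> authors = new ArrayList<>(counts.keySet());
        authors.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(authors.subList(0, Math.min(k, authors.size())));
    }
}
```

Calling topAuthors(counts, 5) would give the five most frequently retweeted users, which correspond to the largest nodes in a figure such as Figure 5.2.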

Listing 5.2: Extracting the retweet network

public JSONObject ConvertTweetsToDiffusionPath(String inFilename,
        int numNodeClasses, JSONObject hashtags, int num_nodes) {
    // Step 1: Read through the file and process Tweets matching the topics
    ...
    // Step 2: Identify the size of the nodes based on the number of times
    // they are retweeted
    ArrayList<NetworkNode> nodes = ComputeGroupsSqrt(
            returnnodes, max, min, numNodeClasses);
    ...
    /** Step 3
     * Prune the network to keep only the top |nodes_to_visit| nodes in the network.
     * Recursively visit all top nodes and retain their connections.
     */
    for (int k = 0; k < nodes_to_visit; k++) {
        NetworkNode nd = nodes.get(k);
        nd.level = 0;
        HashMap<String,NetworkNode> rtnodes = GetNextHopConnections(
                userconnections, nd, new HashMap<String,NetworkNode>());
        ...
    /** Step 4: Compact the nodes of the network by removing
     * all nodes who have never been retweeted
     */
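
Steps 2 and 3 of the listing can be illustrated with a small, self-contained sketch. The bodies of ComputeGroupsSqrt and GetNextHopConnections are not shown in the listing, so everything below is an assumption: the class NetworkPruning, the method names, the square-root binning formula, and the adjacency-map representation are all hypothetical and may differ from the book's implementation.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helpers; the book's ComputeGroupsSqrt and
// GetNextHopConnections may work differently.
final class NetworkPruning {
    // Step 2 idea (assumed): map a retweet count to a size class in
    // [0, numClasses - 1] using square-root scaling between min and max.
    static int sizeClass(int count, int min, int max, int numClasses) {
        if (max <= min) {
            return 0; // degenerate range: everything falls in one class
        }
        double lo = Math.sqrt(min);
        double hi = Math.sqrt(max);
        double fraction = (Math.sqrt(count) - lo) / (hi - lo);
        int cls = (int) (fraction * numClasses);
        return Math.max(0, Math.min(numClasses - 1, cls));
    }

    // Step 3 idea (assumed): starting from the top nodes, recursively follow
    // retweet connections and collect every user reached. Users that are
    // never reached would be pruned from the network.
    static Set<String> retain(Map<String, List<String>> retweetersOf,
                              List<String> topNodes) {
        Set<String> kept = new HashSet<>();
        for (String node : topNodes) {
            visit(node, retweetersOf, kept);
        }
        return kept;
    }

    private static void visit(String node,
                              Map<String, List<String>> retweetersOf,
                              Set<String> kept) {
        if (!kept.add(node)) {
            return; // already visited; this guards against cycles
        }
        for (String next : retweetersOf.getOrDefault(node, Collections.emptyList())) {
            visit(next, retweetersOf, kept);
        }
    }
}
```

Square-root scaling is one plausible reading of the method name: it compresses large counts so that a few very popular users do not force every other node into the smallest class.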
