Lightweight Programming - Graph Analysis and Visualization

From, To, CC, Date, Size
"Joe", "Zoe", "Tim", 12/09/2014, 156kb
"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb
"Joe", "Tim", "Ben; Zoe", 11/09/2014, 805kb
"Joe", , "Ben", 11/01/2014, 22kb

The node data is a list of all the unique e-mail participants. The link data
is a list of all the e-mails exchanged between each pair of participants,
aggregated into a single link whose weight is the number of communications
between the two. Processing e-mail data programmatically into a graph is
similar to the previous example, with a bit more effort to prepare the data
and an extra step to generate links:

1. Open the data file.
2. For each line in the data file:
   a. Create a distribution list of all the people involved in the e-mail.
   b. Add each person to a list of nodes, checking to make sure that nodes
      are not duplicated.
   c. For each pair of people, define a uniquely named link (that is,
      source-target), and add that to a list of links, making sure that
      links are not duplicated.
3. Write out the node file and link file.
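The steps above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the book's own code: the function names build_graph and write_graph, the output column names (id, source, target, weight), and the choice to treat a link as an undirected pair (so that Joe-Tim and Tim-Joe count as the same link) are all hypothetical. The sample rows are read from a string rather than from emailSample.txt, and skipinitialspace=True is passed because the sample puts a space after each comma.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# The sample rows shown earlier, held in a string so the sketch is self-contained.
SAMPLE = '''From, To, CC, Date, Size
"Joe", "Zoe", "Tim", 12/09/2014, 156kb
"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb
"Joe", "Tim", "Ben; Zoe", 11/09/2014, 805kb
"Joe", , "Ben", 11/01/2014, 22kb
'''


def build_graph(datafile):
    """Return (nodes, links) built from rows with From, To, CC columns."""
    nodes = []         # unique participants, in order of first appearance
    links = Counter()  # (source, target) pair -> number of shared e-mails
    # Assumption: skipinitialspace=True, to absorb the space after each comma.
    datareader = csv.reader(datafile, skipinitialspace=True)
    next(datareader, None)  # skip the header row
    for row in datareader:
        if not row:
            continue  # ignore blank lines
        # Step 2a: distribution list = From + To + CC (CC is ';'-separated).
        people = [row[0], row[1]] + row[2].split(";")
        people = [p.strip() for p in people if p.strip()]
        people = list(dict.fromkeys(people))  # drop repeats within one e-mail
        # Step 2b: add each person to the node list once.
        for person in people:
            if person not in nodes:
                nodes.append(person)
        # Step 2c: one link per pair; sorting makes the pair order-independent
        # (an assumption), and the counter doubles as the link weight.
        for source, target in combinations(sorted(people), 2):
            links[(source, target)] += 1
    return nodes, links


def write_graph(nodes, links, nodefile, linkfile):
    """Step 3: write the node list and the weighted link list as CSV."""
    node_writer = csv.writer(nodefile)
    node_writer.writerow(["id"])  # hypothetical column name
    node_writer.writerows([name] for name in nodes)
    link_writer = csv.writer(linkfile)
    link_writer.writerow(["source", "target", "weight"])  # hypothetical names
    for (source, target), weight in links.items():
        link_writer.writerow([source, target, weight])


nodes, links = build_graph(io.StringIO(SAMPLE))
node_out, link_out = io.StringIO(), io.StringIO()
write_graph(nodes, links, node_out, link_out)
print(nodes)
print(link_out.getvalue())
```

Keying the counter on a (source, target) tuple plays the role of the "uniquely named link": a duplicate pair increments the existing weight instead of adding a second link. To write real files, the two StringIO objects could be replaced with files opened for writing with newline="".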

Opening the file and starting the loop to process each row is similar to the
previous example, except that the default comma delimiter is used.

import csv

with open("emailSample.txt") as datafile:
    datareader = csv.reader(datafile)
    # skip the header row
    next(datareader, None)
    # process each row: add source node and target node
    for row in datareader:
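One detail is worth checking when relying on the default settings. If the file is laid out exactly like the sample rows, with a space after every comma, Python's csv.reader keeps that space, and the quotation marks that follow it, as part of the field unless skipinitialspace=True is passed. The sketch below compares the two behaviors on one sample row; it assumes the sample layout, and the actual emailSample.txt may differ.

```python
import csv
import io

# One row copied from the sample data, with a space after each comma.
line = '"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb\n'

# Default settings: the leading space stays in the field, so the quote
# that follows it is treated as an ordinary character.
default_row = next(csv.reader(io.StringIO(line)))

# skipinitialspace=True drops the space after each delimiter, so the
# quoted fields are recognized and unquoted.
trimmed_row = next(csv.reader(io.StringIO(line), skipinitialspace=True))

print(default_row)
print(trimmed_row)
```

If the names come back with stray spaces or quotation marks, passing skipinitialspace=True, or stripping each field during data preparation, is one way to normalize them before building the node list.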

Processing each row has a few more steps than before. Some data preparation
is done first. For example, the e-mail size, stored in row[4], is
