From, To, CC, Date, Size
"Joe", "Zoe", "Tim", 12/09/2014, 156kb
"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb
"Joe", "Tim", "Ben; Zoe", 11/09/2014, 805kb
"Joe", , "Ben", 11/01/2014, 22kb
The node data is a list of all the unique e-mail participants. The link data
is a list of all the e-mails exchanged between each pair of participants,
aggregated into a single link per pair, with a weight representing the number
of communications between the two participants. Processing e-mail data
programmatically into a graph is similar to the previous example, but it takes
a bit more effort to prepare the data and an extra step to generate the links:
1. Open the data file.
2. For each line in the data file:
   a. Create a distribution list of all the people involved in the e-mail.
   b. Add each person to a list of nodes, checking to make sure that nodes
      are not duplicated.
   c. For each pair of people, define a uniquely named link (that is,
      source-target), and add that to a list of links, making sure that
      links are not duplicated.
3. Write out the node file and link file.
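The steps above can be sketched end to end as follows. This is a minimal illustration rather than the book's own listing. It makes several assumptions: the sample rows are embedded in a string so the block runs on its own, `skipinitialspace=True` is passed because the sample has a space after each comma, each pair of names is sorted so that a pair always produces the same link name, and the helper names `build_graph` and `write_graph` and the output column headers are hypothetical.

```python
import csv
import io
from itertools import combinations

# Sample rows copied from the listing above; a real script would open the file.
SAMPLE = '''From, To, CC, Date, Size
"Joe", "Zoe", "Tim", 12/09/2014, 156kb
"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb
"Joe", "Tim", "Ben; Zoe", 11/09/2014, 805kb
"Joe", , "Ben", 11/01/2014, 22kb
'''

def build_graph(datafile):
    """Return (nodes, links); links maps a 'source-target' name to a weight."""
    nodes = []   # unique participants, in first-seen order
    links = {}   # link name -> number of e-mails shared by that pair
    # Assumption: skip the space that follows each comma in the sample.
    reader = csv.reader(datafile, skipinitialspace=True)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        # Step 2a: distribution list = sender + To + CC (semicolon-separated).
        people = [row[0].strip()]
        for field in (row[1], row[2]):
            people += [p.strip() for p in field.split(";") if p.strip()]
        people = list(dict.fromkeys(people))  # drop repeats, keep order
        # Step 2b: add each person to the node list once.
        for person in people:
            if person not in nodes:
                nodes.append(person)
        # Step 2c: one uniquely named link per pair; repeats raise the weight.
        # Assumption: names are sorted so A-B and B-A share one link.
        for source, target in combinations(sorted(people), 2):
            name = f"{source}-{target}"
            links[name] = links.get(name, 0) + 1
    return nodes, links

def write_graph(nodes, links, node_out, link_out):
    """Step 3: write nodes and links as CSV (column names are hypothetical)."""
    node_writer = csv.writer(node_out)
    node_writer.writerow(["Id"])
    node_writer.writerows([n] for n in nodes)
    link_writer = csv.writer(link_out)
    link_writer.writerow(["Source", "Target", "Weight"])
    for name, weight in links.items():
        source, target = name.split("-", 1)
        link_writer.writerow([source, target, weight])

nodes, links = build_graph(io.StringIO(SAMPLE))
node_out, link_out = io.StringIO(), io.StringIO()
write_graph(nodes, links, node_out, link_out)
```

With the four sample rows, Joe and Tim appear together in three e-mails, so their link ends up with a weight of 3. Storing links in a dictionary keyed by name handles the duplicate check and the weight count in one place.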
Opening the file and starting the loop to process each row is similar to the
previous example, except the default comma delimiter is used.
import csv

with open("emailSample.txt") as datafile:
    datareader = csv.reader(datafile)
    # skip the header row
    next(datareader, None)
    # process each row: add source node and target node
    for row in datareader:
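One detail worth checking when reading this sample: each comma is followed by a space. The short sketch below, which is not from the original text, compares how `csv.reader` splits one sample line with its default settings and with `skipinitialspace=True`. Whether the real data file contains those spaces is an assumption based on the listing as printed.

```python
import csv

# One line copied from the sample data above.
line = '"Joe", "Ben", "Ann; Tim; Zoe", 11/09/2014, 2048kb'

# Default settings: the space after each comma stays part of the field,
# so the quotes that follow it are not treated as field delimiters.
default_fields = next(csv.reader([line]))

# skipinitialspace=True drops the space directly after each delimiter,
# so the quoted names are parsed cleanly.
clean_fields = next(csv.reader([line], skipinitialspace=True))

print(default_fields)
print(clean_fields)
```

If the fields come back with leading spaces or stray quotes, either pass `skipinitialspace=True` or strip each field before using it as a node name.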
Processing each row has a few more steps than before. Some data
preparation is done first. For example, the e-mail size, stored in row[4], is