    print '<date>' + row[0] + '</date>'
    print '<max_temperature>' + row[1] + '</max_temperature>'
    print '</observation>'

Each row has two values: the date and the maximum temperature. End the XML conversion with its closing tag:

print '</weather_data>'
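For reference, here is the CSV-to-XML conversion as one self-contained script, in the same Python 2 style as the snippets above. Treat it as a sketch: the input file name wunder-data.txt is an assumption carried over from the earlier Weather Underground example.

import csv

# Read the comma-delimited data, one row per observation.
# The file name is an assumption; use whatever your CSV file is called.
reader = csv.reader(open('wunder-data.txt', 'r'), delimiter=',')

# Wrap every row in <observation> tags inside a single <weather_data> root.
print '<weather_data>'
for row in reader:
    print '<observation>'
    print '<date>' + row[0] + '</date>'
    print '<max_temperature>' + row[1] + '</max_temperature>'
    print '</observation>'
print '</weather_data>'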
Two main things are at play here: first, you read the data in, and second, you iterate over the data, changing each row in some way. It's the same logic if you were to convert the resulting XML back to CSV. As shown in the following snippet, the difference is that you use a different module to parse the XML file.
from BeautifulSoup import BeautifulStoneSoup

# Load the XML file as a string.
f = open('wunder-data.xml', 'r')
xml = f.read()

# Parse the XML and find every <observation> element.
soup = BeautifulStoneSoup(xml)
observations = soup.findAll('observation')

# Print each observation as a CSV row: date, maximum temperature.
for o in observations:
    print o.date.string + "," + o.max_temperature.string
The code looks different, but you're basically doing the same thing. Instead of importing the csv module, you import BeautifulStoneSoup from BeautifulSoup. Remember, you used BeautifulSoup to parse the HTML from Weather Underground; BeautifulStoneSoup parses more general XML.
You can open the XML file for reading with open() and then load the contents into the xml variable. At this point, the contents are stored as a string. To parse it, pass the xml string to BeautifulStoneSoup, which lets you work with each <observation> in the XML file. Use findAll() to fetch all the observations, and finally, as you did with the CSV-to-XML conversion, loop through each observation, printing the values in your desired format.
This takes you back to where you began:
20090101,26
20090102,34
20090103,27
20090104,34
...
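If you're working with Python 3 and BeautifulSoup 4 rather than the older BeautifulSoup module, the same XML-to-CSV step looks roughly like the sketch below. This is not the book's code: it assumes the bs4 and lxml packages are installed and that the file is still named wunder-data.xml.

from bs4 import BeautifulSoup

# Load and parse the XML (the 'xml' parser is provided by lxml).
with open('wunder-data.xml', 'r') as f:
    soup = BeautifulSoup(f.read(), 'xml')

# Print each observation as a CSV row: date, maximum temperature.
for o in soup.find_all('observation'):
    print(o.date.string + ',' + o.max_temperature.string)

The printed rows match the output shown above.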