The first thing you need to do is load the page that shows historical weather
information. The URL for historical weather in Buffalo on October 1, 2010,
follows:
www.wunderground.com/history/airport/KBUF/2010/10/1/DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA
If you remove everything after .html in the preceding URL, the same page
still loads, so drop those query parameters. You don't need them right now.
www.wunderground.com/history/airport/KBUF/2010/10/1/DailyHistory.html
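The trimming described above can also be done in code rather than by hand. The following is a minimal sketch in Python 3 using the standard urllib.parse module; the strip_query name is a hypothetical helper, and the http:// prefix is an assumption, since the URLs in the text are written without a scheme.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep scheme, host, and path; blank out the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The http:// scheme is assumed here; the text shows the URL without one.
full = ("http://www.wunderground.com/history/airport/KBUF/2010/10/1/"
        "DailyHistory.html?req_city=NA&req_state=NA&req_statename=NA")

print(strip_query(full))
```

This only removes the parameters from the string. Whether the shorter address still serves the same page is something you confirm in the browser, as described above.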
The date is indicated in the URL with /2010/10/1. Using the drop-down
menu, change the date to January 1, 2009, because you're going to scrape
temperature for all of 2009. The URL is now this:
www.wunderground.com/history/airport/KBUF/2009/1/1/DailyHistory.html
Everything is the same as the URL for October 1, except the portion
that indicates the date. It's /2009/1/1 now. Interesting. Without using the
drop-down menu, how can you load the page for January 2, 2009? Simply
change the date parameter so that the URL looks like this:
www.wunderground.com/history/airport/KBUF/2009/1/2/DailyHistory.html
Load the preceding URL in your browser and you get the historical summary
for January 2, 2009. So all you have to do to get the weather for a
specific date is to modify the Weather Underground URL. Keep this in mind
for later.
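Because only the date portion of the URL changes, the address for any day can be generated instead of typed. The following is a minimal sketch in Python 3, assuming the URL pattern shown above holds for every date. The names BASE, history_url, and days_in_year are hypothetical, and the http:// prefix is an assumption.

```python
from datetime import date, timedelta

# Assumed base address; the text shows it without the http:// scheme.
BASE = "http://www.wunderground.com/history/airport"


def history_url(day, airport="KBUF"):
    """Build the daily-history URL for one date (hypothetical helper)."""
    # Month and day are not zero-padded, matching /2009/1/2 in the text.
    return "%s/%s/%d/%d/%d/DailyHistory.html" % (
        BASE, airport, day.year, day.month, day.day)


def days_in_year(year):
    """Yield every date in the given year, in order."""
    day = date(year, 1, 1)
    while day.year == year:
        yield day
        day += timedelta(days=1)


urls = [history_url(d) for d in days_in_year(2009)]
print(len(urls))   # one URL per day of 2009
print(urls[1])     # the URL for January 2, 2009
```

Stepping through real dates with the datetime module avoids having to hard-code how many days each month has.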
Now load a single page with Python, using the urllib2 library (part of the
Python 2 standard library) by importing it with the following line of code:
import urllib2
To load the January 1 page with Python, use the urlopen function.
page = urllib2.urlopen("http://www.wunderground.com/history/airport/KBUF/2009/1/1/DailyHistory.html")
This opens the URL and stores the response in the page variable; calling
page.read() returns all the HTML that the URL points to. The next step is
to extract the maximum temperature value you're interested in.
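The urllib2 module exists only in Python 2. In Python 3, the same urlopen function lives in urllib.request. The following is a minimal sketch of the loading step for Python 3. The fetch_html name and its opener parameter are hypothetical, and decoding the response as UTF-8 is an assumption about the page's encoding.

```python
from urllib.request import urlopen


def fetch_html(url, opener=urlopen):
    """Open the URL and return the response body as text (hypothetical helper).

    The opener parameter defaults to urllib.request.urlopen and can be
    swapped for a stand-in when no network connection is available.
    """
    response = opener(url)
    try:
        raw = response.read()  # bytes of the HTML document
    finally:
        response.close()
    # UTF-8 is assumed; undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


# Example call (requires network access; the http:// scheme is assumed):
# html = fetch_html("http://www.wunderground.com/history/airport/"
#                   "KBUF/2009/1/1/DailyHistory.html")
```

The example call is left commented out because it depends on the site still serving that address.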