From: Adam W. on
I'm trying to scrape some historical data from NOAA's website, but I
can't seem to feed it the right form values to get the data out of
it. Here's the code:

import urllib
import urllib2

## The source page http://www.erh.noaa.gov/bgm/climate/bgm.shtml
url = 'http://www.erh.noaa.gov/bgm/climate/pick.php'
values = {'month' : 'July',
          'year' : '1988'}

user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = { 'User-Agent' : user_agent }

data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page
From: Jon Clements on
On 24 Sep, 22:18, "Adam W." <awasile...(a)gmail.com> wrote:
> I'm trying to scrape some historical data from NOAA's website, but I
> can't seem to feed it the right form values to get the data out of
> it.  Here's the code:
>
> import urllib
> import urllib2
>
> ## The source page http://www.erh.noaa.gov/bgm/climate/bgm.shtml
> url = 'http://www.erh.noaa.gov/bgm/climate/pick.php'
> values = {'month' : 'July',
>           'year' : '1988'}
>
> user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
> headers = { 'User-Agent' : user_agent }
>
> data = urllib.urlencode(values)
> req = urllib2.Request(url, data, headers)
> response = urllib2.urlopen(req)
> the_page = response.read()
> print the_page

Hint:

<select name="month">
<option value="/jan">January</option>

<option value="/feb">February</option>
<option value="/mar">March</option>
<option value="/apr">April</option>
<option value="/may">May</option>
<option value="/jun">June</option>
<option value="/jul">July</option>

<option value="/aug">August</option>
<option value="/sep">September</option>
<option value="/oct">October</option>
<option value="/nov">November</option>
<option value="/dec">December</option>
</select>
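
To spell the hint out: the form submits each option's value attribute ("/jul"), not the label shown in the browser ("July"). Below is a rough sketch of building the POST data that way. The MONTH_VALUES table and the build_form_data helper are names made up for illustration, and it assumes the year field accepts a plain four-digit string, which the markup above does not show, so check the year <select> on the page too.

```python
try:
    from urllib import urlencode        # Python 2, as in the original post
except ImportError:
    from urllib.parse import urlencode  # Python 3

# Option values copied from the <select name="month"> markup above.
MONTH_VALUES = {
    'January': '/jan', 'February': '/feb', 'March': '/mar',
    'April': '/apr', 'May': '/may', 'June': '/jun',
    'July': '/jul', 'August': '/aug', 'September': '/sep',
    'October': '/oct', 'November': '/nov', 'December': '/dec',
}

def build_form_data(month_label, year):
    """Encode the form fields using the option value, not its visible label."""
    # Assumption: the year field takes the year as a plain string.
    fields = [('month', MONTH_VALUES[month_label]), ('year', str(year))]
    return urlencode(fields)

data = build_form_data('July', 1988)
print(data)
```

The resulting string would take the place of data in the original urllib2.Request(url, data, headers) call. On Python 3 it would need data.encode('ascii') first, since the request body has to be bytes there.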

Jon.