Can't get the entire web page after a query
I'm trying to scrape the historical NAVPS tables found on this page:
http://www.philequity.net/pefi_historicalnavps.php
All of the code shown here makes up my minimal working script. It starts with:
import urllib
import urllib2
from BeautifulSoup import BeautifulSoup
opener = urllib2.build_opener()
urllib2.install_opener(opener)
After studying the web page using Chrome's Inspect Element, I found that the form data sent is the following:
form_data = {}
form_data['mutualFund'] = '1'
form_data['year'] = '1995'
form_data['dmonth'] = 'Month'
form_data['dday'] = 'Day'
form_data['dyear'] = 'Year'
So I continue building up the request:
url = "http://www.philequity.net/pefi_historicalnavps.php"
params = urllib.urlencode(form_data)
request = urllib2.Request(url, params)
I expect this to be the equivalent of clicking "Get NAVPS" after filling in the form:
page = urllib2.urlopen(request)
Then I read it with BeautifulSoup:
soup = BeautifulSoup(page.read())
print soup.prettify()
But alas! I only get the web page as though I didn't click "Get NAVPS" :(
Am I missing something? Is the server sending the table in a separate stream? How do I get to it?
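One way to narrow this down is to print exactly what the script would send and compare it, field by field, with the Form Data shown in the browser's network panel. The following is a minimal sketch of that check, not a fix. It assumes Python 3, where `urllib.request` and `urllib.parse` replace `urllib2` and `urllib.urlencode`; it only builds the request and does not contact the server.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://www.philequity.net/pefi_historicalnavps.php"

# Same fields as in the script above.
form_data = {
    "mutualFund": "1",
    "year": "1995",
    "dmonth": "Month",
    "dday": "Day",
    "dyear": "Year",
}

# In Python 3 the request body must be bytes, not str.
body = urlencode(form_data).encode("ascii")
request = Request(URL, data=body)

# Supplying data makes urllib issue a POST rather than a GET.
print(request.get_method())

# The exact body that would be sent, for comparison with the browser's.
print(body.decode("ascii"))
```

Any field that appears in the browser's request but not in this printed body is a candidate for what the script is missing.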
Comments (1)
When I look at the POST request in Firebug, I see one more parameter that you aren't passing: "type", with the value "Year". I don't know whether adding it will get you the data; there are any number of other reasons the server might not return it.
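As a rough sketch of that suggestion, the extra field can be added to the form data before the request is built. This assumes Python 3 (`urllib.request` in place of `urllib2`), and `build_request` is a hypothetical helper introduced here for illustration. Whether the server returns the table once "type" is included has not been verified.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://www.philequity.net/pefi_historicalnavps.php"


def build_request(form_data):
    """Return a POST request carrying form_data as a URL-encoded body.

    Hypothetical helper, not part of the original script.
    """
    body = urlencode(form_data).encode("ascii")
    return Request(URL, data=body)


form_data = {
    "mutualFund": "1",
    "year": "1995",
    "dmonth": "Month",
    "dday": "Day",
    "dyear": "Year",
}

# The extra parameter seen in the browser's POST request.
form_data["type"] = "Year"

request = build_request(form_data)

# Show the body that would be sent; nothing is sent at this point.
print(request.data.decode("ascii"))
```

Sending it would then be a matter of passing `request` to `urllib.request.urlopen` and parsing the response as before.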