Facebook stream API error: works in the browser but not server-side
If I enter this URL in a browser, it returns the valid XML data that I am interested in scraping.
http://www.facebook.com/ajax/stream/profile.php?__a=1&profile_id=36343869811&filter=2&max_time=0&try_scroll_load=false&_log_clicktype=Filter%20Stories%20or%20Pagination&ajax_log=0
However, if I request it from the server side, it no longer works as it previously did. Now it just returns this error, which seems to be the default error message:
{u'silentError': 0, u'errorDescription': u"Something went wrong. We're working on getting it fixed as soon as we can.", u'errorSummary': u'Oops', u'errorIsWarning': False, u'error': 1357010, u'payload': None}
Here is the code in question. I've tried multiple User-Agent strings, to no avail:
import urllib2

user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; he; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3'
uaheader = {'User-Agent': user_agent}
wallurl = 'http://www.facebook.com/ajax/stream/profile.php?__a=1&profile_id=36343869811&filter=2&max_time=0&try_scroll_load=false&_log_clicktype=Filter%20Stories%20or%20Pagination&ajax_log=0'
req = urllib2.Request(wallurl, headers=uaheader)
resp = urllib2.urlopen(req)
# convertTextToUnicode is my own helper (not shown here)
pageData = convertTextToUnicode(resp.read())
print pageData  # prints the error shown above
What would be the difference between the server calls and my own browser, aside from the User-Agent and the IP address?
Comments (1)
I tried the above URL in both Chrome and Firefox. It works in Chrome but fails in Firefox. In Chrome I am signed in to Facebook, while in Firefox I am not.
This could be the reason for the discrepancy. You will need to provide authentication in the urllib2-based script that you posted.
There is an existing question on authentication with urllib2.
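
To illustrate what "providing authentication" might look like, here is a minimal sketch of a cookie-aware opener. It is written for Python 3, where urllib2 became urllib.request and cookielib became http.cookiejar. The helper names (build_session_opener, make_cookie) and the cookie name "session" are hypothetical placeholders, not anything Facebook documents, and how you obtain the cookies of a logged-in session is outside the scope of this sketch.

```python
import http.cookiejar
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Windows; U; Windows NT 6.1; he; rv:1.9.2.3) "
              "Gecko/20100401 Firefox/3.6.3")


def build_session_opener(jar=None):
    """Return (opener, jar). The opener stores cookies it receives and
    resends them on later requests, much as a browser does."""
    if jar is None:
        jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener, jar


def make_cookie(name, value, domain):
    """Build a session cookie by hand (hypothetical helper, illustration only)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )
```

With this in place, any cookie stored in the jar (for example, one set by a login response fetched through the same opener) is sent automatically by opener.open(url), whereas a bare urlopen call sends no cookies at all. That is one concrete difference between the script and a signed-in browser, in addition to the User-Agent and the IP address.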