Python: urlopen does not download the whole website

Posted on 2024-09-13 22:56:48

Greetings,

I have done:

import urllib

site = urllib.urlopen('http://www.weather.com/weather/today/Temple+TX+76504')
site_data = site.read()
site.close()

but the result does not match what I see with View Source after loading the page in Firefox.

I suspected the user agent and did this:

class AppURLopener(urllib.FancyURLopener):
    version = "Mozilla/5.0 (X11; U; Linux i686; zh-CN; rv:1.9.2.8) Gecko/20100722 Ubuntu/10.04 (lucid) Firefox/3.6.8"

urllib._urlopener = AppURLopener()

and downloaded the page again, but I still do not get the whole page.

If the user agent is the likely culprit, can someone please help me switch it correctly?

Thanks,
Narnie
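
For readers on Python 3, where `urllib.urlopen` no longer exists, here is a minimal sketch of the same user agent switch using `urllib.request`. The helper names `build_request` and `fetch`, the default user agent string, and the timeout are assumptions made for illustration, not part of the original post. Note that this only changes the header sent to the server; it cannot by itself account for content a browser adds by running scripts after the page loads.

```python
from urllib.request import Request, urlopen

# Hypothetical default; any browser-like string could be used instead.
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (X11; U; Linux i686; rv:1.9.2.8) "
    "Gecko/20100722 Ubuntu/10.04 (lucid) Firefox/3.6.8"
)


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Return a Request object that sends the given User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=DEFAULT_USER_AGENT, timeout=10):
    """Download the raw bytes at url, sending the custom User-Agent."""
    with urlopen(build_request(url, user_agent), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # URL taken from the post; the request is only made when run as a script.
    site_data = fetch("http://www.weather.com/weather/today/Temple+TX+76504")
    print(len(site_data))
```

Keeping the request construction separate from the download makes it possible to inspect the headers that would be sent without touching the network.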
