Python urllib cache

Posted 2024-11-25 13:00:49

I'm writing a script in Python that should determine if it has internet access.

import urllib

CHECK_PAGE     = "http://64.37.51.146/check.txt"
CHECK_VALUE    = "true\n"
PROXY_VALUE    = "Privoxy"
OFFLINE_VALUE  = ""

page = urllib.urlopen(CHECK_PAGE)
response = page.read()
page.close()

if response.find(PROXY_VALUE) != -1:
    urllib.getproxies = lambda x = None: {}
    page = urllib.urlopen(CHECK_PAGE)
    response = page.read()
    page.close()

if response != CHECK_VALUE:
    print "'" + response + "' != '" + CHECK_VALUE + "'"
else:
    print "You are online!"

I use a proxy on my computer, so correct proxy handling is important. If the script can't reach the internet through the proxy, it should bypass the proxy and check whether it's stuck at a login page (as happens at many public hotspots I use). With this code, if I am not connected to the internet, the first read() returns the proxy's error page. But when I bypass the proxy after that, I get the same page again. If I bypass the proxy BEFORE making any requests, I get an error, as I should. I think Python is caching the page from the first request.

How do I force Python to clear its cache (or is this some other problem)?

何止钟意 2024-12-02 13:00:49

Calling urllib.urlcleanup() before each call to urllib.urlopen() solves the problem. In fact, urllib.urlopen calls the urlretrieve() function, which creates a cache to hold the data, and urlcleanup() removes it.
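
A minimal Python 3 sketch of this suggestion, assuming the modern urllib.request module rather than the Python 2 urllib used in the question. The helper name fetch_fresh and the timeout value are illustrative assumptions, not part of the answer. In CPython 3, urlcleanup() deletes temporary files left by urlretrieve() and also drops the module-level opener, so the next urlopen() builds a new one and re-reads the proxy settings; Python 2 may behave differently.

```python
import urllib.request


def fetch_fresh(url, timeout=10):
    """Reset urllib's module-level state, then return the body of url as text.

    fetch_fresh is a hypothetical helper name used only for this sketch.
    """
    # Discard temporary files and the cached global opener (CPython 3 behavior).
    urllib.request.urlcleanup()
    # urlopen() now builds a fresh default opener before making the request.
    with urllib.request.urlopen(url, timeout=timeout) as page:
        return page.read().decode("utf-8", errors="replace")
```

With the question's constants, response = fetch_fresh(CHECK_PAGE) could then be compared against CHECK_VALUE as before.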

南巷近海 2024-12-02 13:00:49

You want

page = urllib.urlopen(CHECK_PAGE, proxies={})

Remove the

urllib.getproxies = lambda x = None: {}

line.
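
For readers on Python 3, here is a hedged sketch of the same idea. urllib.request.urlopen() takes no proxies argument there, so one way to skip the proxy is an opener built with an empty ProxyHandler. The function name fetch_direct and the timeout are assumptions made for illustration.

```python
import urllib.request


def fetch_direct(url, timeout=10):
    """Return the body of url as text, bypassing any configured proxy.

    fetch_direct is a hypothetical helper name used only for this sketch.
    """
    # An empty mapping tells ProxyHandler to use no proxies at all,
    # instead of auto-detecting them from the environment.
    no_proxy = urllib.request.ProxyHandler({})
    opener = urllib.request.build_opener(no_proxy)
    with opener.open(url, timeout=timeout) as page:
        return page.read().decode("utf-8", errors="replace")
```

Because the opener is local to the function, it does not touch the module-level opener, so earlier proxied requests cannot influence it.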
