How can I check whether urllib2 followed a redirect?

Posted 2024-12-20 03:08:13 · 775 characters · 3 views · 0 comments

I've written this function:

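import urllib2

def download_mp3(url, name):
    # Fetch the URL; urllib2 follows any redirect silently
    opener1 = urllib2.build_opener()
    page1 = opener1.open(url)
    mp3 = page1.read()
    # Save the response body as <name>.mp3
    filename = name + '.mp3'
    with open(filename, 'wb') as fout:
        fout.write(mp3)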

This function takes a URL and a name, both as strings, then downloads the MP3 from the URL and saves it under the given name.

The URL has the form http://site/download.php?id=xxxx, where xxxx is the id of an MP3.

If the id does not exist, the site redirects me to another page.

So the question is: how can I check whether this id exists? I've tried checking whether the URL exists with a function like this:

import httplib
from urlparse import urlparse

def checkUrl(url):
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status < 400

But it doesn't seem to work.

Thank you


Comments (2)

鸵鸟症 2024-12-27 03:08:13

Something like this, then check the response code:

import urllib2, urllib

class NoRedirectHandler(urllib2.HTTPRedirectHandler):
    # Return the 3xx response itself instead of following the redirect
    def http_error_302(self, req, fp, code, msg, headers):
        infourl = urllib.addinfourl(fp, headers, req.get_full_url())
        infourl.status = code
        infourl.code = code
        return infourl
    # Treat the other redirect statuses the same way
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302

# Install the opener globally so urllib2.urlopen() uses it
opener = urllib2.build_opener(NoRedirectHandler())
urllib2.install_opener(opener)
response = urllib2.urlopen('http://google.com')
if response.code in (300, 301, 302, 303, 307):
    print('redirect')
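
For readers on Python 3, where urllib2 was merged into urllib.request, the same idea can be sketched a little differently. This is an illustration, not the code from the answer above: instead of overriding each http_error_3xx method, it overrides redirect_request to return None, which makes the opener raise HTTPError with the original 3xx status. The function name redirect_target and its opener parameter are assumptions made for the example.

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to follow the redirect; the
        # opener then raises HTTPError carrying the 3xx status code.
        return None


def redirect_target(url, opener=None, timeout=10):
    """Return the raw Location header if `url` redirects, else None.

    `opener` is a hypothetical hook for reusing (or faking) an opener;
    by default one with NoRedirectHandler is built.
    """
    opener = opener or urllib.request.build_opener(NoRedirectHandler())
    try:
        with opener.open(url, timeout=timeout):
            return None  # 2xx answer: the server did not redirect
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            # The header may be a relative URL such as "/home".
            return err.headers.get("Location", "")
        raise  # a genuine error such as 404 or 500
```

With the URL scheme from the question, a missing id would then presumably show up as a non-None return value, while an existing id returns None and can be downloaded as usual.
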
通知家属抬走 2024-12-27 03:08:13

My answer to this looked like:

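import urllib2

# url is the address being checked
req = urllib2.Request(url)
try:
    response = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    # The server answered with an error status; handle it or re-raise
    raise
else:
    # response.url is the final URL after any redirects were followed
    if response.url != url:
        print('We have redirected!')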
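
Applied to the download function from the question, the URL comparison might look like the following Python 3 sketch (urllib.request in place of urllib2). The helper name fetch_unless_redirected, its opener parameter, and the True/False return value of download_mp3 are assumptions for illustration, not part of the original code.

```python
import urllib.request


def fetch_unless_redirected(url, opener=None, timeout=10):
    """Return the response body, or None if the request ended up elsewhere."""
    opener = opener or urllib.request.build_opener()
    with opener.open(url, timeout=timeout) as response:
        # response.url is the URL actually retrieved, after any redirects.
        if response.url != url:
            return None
        return response.read()


def download_mp3(url, name):
    """Save `url` as `<name>.mp3`; return False when the id redirected."""
    data = fetch_unless_redirected(url)
    if data is None:
        return False
    with open(name + ".mp3", "wb") as fout:
        fout.write(data)
    return True
```

A plain string comparison treats any change to the URL as a redirect, including harmless ones such as an http to https upgrade, so it may need loosening for sites that normalize their URLs.
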