How to get the real URL of a file on the web (Python)

Posted on 2024-08-11 17:33:12

I notice that sometimes audio files on the internet have a "fake" URL.

http://garagaeband.com/3252243

And this will 302 to the real URL:

http://garageband.com/michael_jackson4.mp3

My question is: when supplied with the fake URL, how can I get the real URL from the headers?

Currently, this is my code for reading the headers of a file. I don't know whether it will do what I want. How do I parse the "real" URL out of the response headers?

import httplib
# head is the host name and tail is the path of the fake URL
conn = httplib.HTTPConnection(head)
conn.request("HEAD", tail)
res = conn.getresponse()

This has a 302 redirect:
http://www.garageband.com/mp3cat/.UZCMYiqF7Kum/01_No_pierdas_la_fuente_del_gozo.mp3

黯然#的苍凉 2024-08-18 17:33:12

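Use geturl() on the response returned by urllib.urlopen().

edit:
Sorry, I haven't done this in a while; the urllib.getUrl() I first suggested does not exist. This is the correct call:

import urllib
urllib.urlopen(url).geturl()

For example, with urllib2, which exposes the same method:

>>> import urllib2
>>> f = urllib2.urlopen("http://tinyurl.com/oex2e")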
>>> f.geturl()
'http://www.amazon.com/All-Creatures-Great-Small-Collection/dp/B00006G8FI'
>>>
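
The answer above uses the Python 2 modules urllib and urllib2. As a rough sketch of the same idea on Python 3, where those modules were merged into urllib.request, something like the following should work. The function name resolve_final_url and the timeout parameter are illustrative assumptions, not part of the original answer.

```python
import urllib.request


def resolve_final_url(url, timeout=10):
    """Return the URL that `url` ends up at once redirects are followed.

    `resolve_final_url` and `timeout` are hypothetical names chosen for
    this sketch.
    """
    # urlopen() follows 3xx redirects automatically; geturl() reports the
    # URL of the resource that was finally retrieved.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()
```

Calling resolve_final_url on a URL that redirects returns the address of the final resource; a URL that does not redirect comes back unchanged.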

一梦等七年七年为一梦 2024-08-18 17:33:12

Mark Pilgrim recommends httplib2 in "Dive Into Python 3" because it handles many things (including redirects) more intelligently.

>>> import httplib2
>>> h = httplib2.Http()
>>> response, content = h.request("http://garagaeband.com/3252243")
>>> response["content-location"]
'http://garageband.com/michael_jackson4.mp3'

她比我温柔 2024-08-18 17:33:12

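You have to read the response, notice that you received a 302 (Found), parse the real URL out of the response headers (it is in the Location header), and then fetch the resource using the new URI.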
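
The manual approach described in this answer could be sketched roughly as follows on Python 3, where httplib is named http.client. This is only an illustration under some assumptions: the function name resolve_redirects, the max_hops limit, and the set of status codes treated as redirects are choices made for the sketch, not something the answer specifies.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve_redirects(url, max_hops=10, timeout=10):
    """Follow redirects by hand with HEAD requests; return the final URL."""
    for _ in range(max_hops):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            # HEAD asks for the headers only, so no file body is downloaded.
            conn.request("HEAD", path)
            response = conn.getresponse()
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        if status not in REDIRECT_CODES or location is None:
            return url  # not a redirect, so this is the real URL
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

The hop limit guards against redirect loops. Some servers answer HEAD differently from GET, so a production version may need to fall back to GET.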

椵侞 2024-08-18 17:33:12

I worked out the answer.

    import urllib2
    # theurl is the fake URL without its scheme
    req = urllib2.Request('http://' + theurl)
    opener = urllib2.build_opener()
    f = opener.open(req)
    print 'the real url is......' + f.url
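
For readers on Python 3, the opener-based approach above might look something like the sketch below, which also records each redirect hop along the way. The names RecordingRedirectHandler and open_and_trace are hypothetical, and recording the hops is an addition for illustration rather than something the original code does.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that remembers every (status code, new URL) hop."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        self.hops.append((code, newurl))
        # Let the standard handler decide how to follow the redirect.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def open_and_trace(url, timeout=10):
    """Return (final_url, hops) for `url`; both names are illustrative."""
    handler = RecordingRedirectHandler()
    # Passing a subclass instance replaces the default redirect handler.
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.url, handler.hops
```

The first element of the returned tuple corresponds to f.url in the Python 2 code; the list of hops is empty when the URL does not redirect.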
