How to get the real URL of a file on the web (Python)

Posted on 2024-08-11 17:33:12

I notice that sometimes audio files on the internet have a "fake" URL.

http://garagaeband.com/3252243

And this will 302 to the real URL:

http://garageband.com/michael_jackson4.mp3

My question is...when supplied with the fake URL, how can you get the REAL URL from headers?

Currently, this is my code for reading the headers of a file. I don't know if this code will get me what I want to accomplish. How do I parse out the "real" URL from the response headers?

import httplib

# head is the host name and tail is the path portion of the URL
conn = httplib.HTTPConnection(head)
conn.request("HEAD", tail)
res = conn.getresponse()

This has a 302 redirect:
http://www.garageband.com/mp3cat/.UZCMYiqF7Kum/01_No_pierdas_la_fuente_del_gozo.mp3


Comments (4)

黯然#的苍凉 2024-08-18 17:33:12

Use urllib.getUrl()

edit:
Sorry, I haven't done this in a while:

import urllib
urllib.urlopen(url).geturl()

For example:

>>> import urllib2
>>> f = urllib2.urlopen("http://tinyurl.com/oex2e")
>>> f.geturl()
'http://www.amazon.com/All-Creatures-Great-Small-Collection/dp/B00006G8FI'
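
The snippets in this answer are Python 2. As a minimal sketch of the same idea in Python 3 (an assumption, since the thread does not say which version is in use), urllib.request.urlopen follows redirects on its own and geturl() reports where it ended up. The helper name resolve_final_url and the timeout parameter are hypothetical, introduced only for illustration.

```python
import urllib.request


def resolve_final_url(url, timeout=10):
    """Return the URL that urllib ends up at after following redirects.

    resolve_final_url is a hypothetical helper name, not part of urllib.
    The response body is never read here; only the final URL is inspected.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()
```

Calling resolve_final_url("http://garagaeband.com/3252243") would be expected to return the redirect target, provided the host from the question still responds.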
一梦等七年七年为一梦 2024-08-18 17:33:12

Mark Pilgrim advises using httplib2 in "Dive Into Python 3", as it handles many things (including redirects) in a smarter way.

>>> import httplib2
>>> h = httplib2.Http()
>>> response, content = h.request("http://garagaeband.com/3252243")
>>> response["content-location"]
"http://garageband.com/michael_jackson4.mp3"

她比我温柔 2024-08-18 17:33:12

You have to read the response, notice that you got a 302 (Found), parse the real URL out of the Location response header, then fetch the resource using the new URI.
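
The manual steps described here can be sketched roughly as follows. This is a hedged illustration in Python 3 (http.client is the Python 3 name for the httplib module used in the question), not a definitive implementation. The names head, resolve_redirects, REDIRECT_CODES and max_hops are hypothetical and chosen only for this example.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch (an assumption of the example).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def resolve_redirects(url, max_hops=10):
    """Follow Location headers by hand and return the last URL reached."""
    for _ in range(max_hops):
        status, location = head(url)
        if status not in REDIRECT_CODES or not location:
            return url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

Using HEAD keeps the audio file itself from being downloaded while the redirect chain is walked. The hop limit is there so that a redirect loop raises an error instead of running forever.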

椵侞 2024-08-18 17:33:12

I worked out the answer.

import urllib2

# theurl is the host and path without the scheme
req = urllib2.Request('http://' + theurl)
opener = urllib2.build_opener()
f = opener.open(req)  # the default opener follows redirects
print 'the real url is......' + f.url