How do I get the complete header information from a urllib2 request?

Posted on 2024-11-29 18:24:22

I am using Python's urllib2 library to open a URL, and I want to get the complete header information for the request. When I call response.info(), I only get this:

Date: Mon, 15 Aug 2011 12:00:42 GMT
Server: Apache/2.2.0 (Unix)
Last-Modified: Tue, 01 May 2001 18:40:33 GMT
ETag: "13ef600-141-897e4a40"
Accept-Ranges: bytes
Content-Length: 321
Connection: close
Content-Type: text/html

I am expecting the complete information that live_http_headers (a Firefox add-on) shows, for example:

http://www.yellowpages.com.mt/Malta-Web/127151.aspx

GET /Malta-Web/127151.aspx HTTP/1.1
Host: www.yellowpages.com.mt
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Cookie: __utma=156587571.1883941323.1313405289.1313405289.1313405289.1;    __utmz=156587571.1313405289.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)

HTTP/1.1 302 Found
Connection: Keep-Alive
Content-Length: 141
Date: Mon, 15 Aug 2011 12:17:25 GMT
Location: http://www.trucks.com.mt
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET, UrlRewriter.NET 2.0.0
X-AspNet-Version: 2.0.50727
Set-Cookie: ASP.NET_SessionId=zhnqh5554omyti55dxbvmf55; path=/; HttpOnly
Cache-Control: private

My request function is:

import cookielib
import urllib
import urllib2


def dorequest(url, post=None, headers=None):
    # Build an opener that keeps cookies between requests.
    cOpener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookielib.CookieJar()))
    urllib2.install_opener(cOpener)
    if post:
        post = urllib.urlencode(post)
    req = urllib2.Request(url, post, headers or {})
    response = cOpener.open(req)
    print response.info()  # this does not give complete header info; how can I get it?
    return response.read()


url = 'http://www.yellowpages.com.mt/Malta-Web/127151.aspx'
html = dorequest(url)

Is it possible to get these header details with urllib2? I don't want to switch to httplib.
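
For context, response.info() only reports the headers of the final response. urllib2 follows redirects automatically, which is probably why the 302 block is missing, and the request headers are never part of it. Below is a minimal sketch of one way to collect all three pieces. It is written against Python 3's urllib.request, the successor of urllib2, as an assumption so that it can be run today; urllib2 offers the same hooks as urllib2.HTTPRedirectHandler and Request.header_items(). The names RecordingRedirectHandler, DemoHandler, fetch_with_headers and run_demo are hypothetical, and the small local server exists only so the example runs without network access.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Remember the status and headers of every redirect before following it."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # headers is the message object of the 3xx response itself.
        self.hops.append((code, msg, list(headers.items())))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_with_headers(url, *extra_handlers):
    """Open url and return request headers, redirect hops and final headers."""
    recorder = RecordingRedirectHandler()
    opener = urllib.request.build_opener(recorder, *extra_handlers)
    req = urllib.request.Request(url)
    response = opener.open(req)
    body = response.read()
    return {
        # After open(), this also lists headers the library added (Host, User-agent).
        "request_headers": req.header_items(),
        "redirects": recorder.hops,
        "final_status": response.status,
        "final_headers": list(response.headers.items()),
        "body": body,
    }


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /old answers 302, anything else answers 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"hello"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/old" % server.server_address[1]
        # An empty ProxyHandler keeps the local request away from any system proxy.
        return fetch_with_headers(url, urllib.request.ProxyHandler({}))
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    result = run_demo()
    print(result["request_headers"])
    print(result["redirects"])
    print(result["final_headers"])
```

Subclassing the redirect handler keeps the automatic redirect behaviour and only records what passes through it. Headers added at the socket level, such as Connection, may still not appear in header_items().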

