Python - asynchronous / threaded urllib2 request example using HTTPS

I'm having a heck of a time getting asynchronous / threaded HTTPS requests to work using Python's urllib2.

Does anyone out there have a basic example that implements urllib2.Request, urllib2.build_opener and a subclass of urllib2.HTTPSHandler?

Thanks!

Comments (5)

七秒鱼° 2024-11-10 19:55:50

The code below performs 7 HTTP requests concurrently.
It does not use threads; instead, it uses asynchronous networking with the Twisted library.

from twisted.web import client
from twisted.internet import reactor, defer

urls = [
    'http://www.python.org',
    'http://stackoverflow.com',
    'http://www.twistedmatrix.com',
    'http://www.google.com',
    'http://launchpad.net',
    'http://github.com',
    'http://bitbucket.org',
]

def finish(results):
    # Called once every page has been downloaded.
    for result in results:
        print 'GOT PAGE', len(result), 'bytes'
    reactor.stop()

# getPage returns a Deferred per URL; gatherResults fires when all of them finish.
waiting = [client.getPage(url) for url in urls]
defer.gatherResults(waiting).addCallback(finish)

reactor.run()
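
Since the question is about HTTPS specifically: getPage should accept https URLs through the same API, provided Twisted's SSL support (pyOpenSSL) is installed. A minimal sketch of that, with a placeholder URL and assuming that dependency is present:

from twisted.web import client
from twisted.internet import reactor

def done(body):
    print 'GOT PAGE', len(body), 'bytes'
    reactor.stop()

# The same getPage call, pointed at an HTTPS URL (placeholder address);
# requires pyOpenSSL for Twisted's SSL support.
client.getPage('https://www.python.org').addCallback(done)
reactor.run()
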
三生一梦 2024-11-10 19:55:50

There's a really simple way, involving a handler for urllib2, which you can find here: http://pythonquirks.blogspot.co.uk/2009/12/asynchronous-http-request.html

#!/usr/bin/env python

import urllib2
import threading

class MyHandler(urllib2.HTTPHandler):
    def http_response(self, req, response):
        print "url: %s" % (response.geturl(),)
        print "info: %s" % (response.info(),)
        for l in response:
            print l
        return response

o = urllib2.build_opener(MyHandler())
t = threading.Thread(target=o.open, args=('http://www.google.com/',))
t.start()
print "I'm asynchronous!"

t.join()

print "I've ended!"
月亮邮递员 2024-11-10 19:55:50

Here is an example using urllib2 (with HTTPS) and threads. Each thread cycles through a list of URLs and retrieves the resource.

import itertools
import urllib2
from threading import Thread


THREADS = 2
URLS = (
    'https://foo/bar',
    'https://foo/baz',
    )


def main():
    for _ in range(THREADS):
        t = Agent(URLS)
        t.start()


class Agent(Thread):
    def __init__(self, urls):
        Thread.__init__(self)
        self.urls = urls

    def run(self):
        urls = itertools.cycle(self.urls)
        while True:
            data = urllib2.urlopen(urls.next()).read()


if __name__ == '__main__':
    main()
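
If you want to fetch each URL once and collect the results instead of looping forever, a variation on the same threaded approach could look like this (the URLs are the same placeholder hosts as above):

import urllib2
from threading import Thread
from Queue import Queue

URLS = (
    'https://foo/bar',
    'https://foo/baz',
    )

def fetch(url, results):
    # Each worker downloads one URL and pushes the body onto a shared queue.
    results.put((url, urllib2.urlopen(url).read()))

def main():
    results = Queue()
    threads = [Thread(target=fetch, args=(url, results)) for url in URLS]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    while not results.empty():
        url, body = results.get()
        print url, len(body), 'bytes'

if __name__ == '__main__':
    main()
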
樱花细雨 2024-11-10 19:55:50

You can use asynchronous IO to do this.

requests + gevent = grequests

GRequests allows you to use Requests with Gevent to make asynchronous HTTP requests easily.

import grequests

urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]

rs = (grequests.get(u) for u in urls)
grequests.map(rs)
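
grequests.map returns the completed requests.Response objects (with None in place of any request that failed), so the results can be inspected afterwards. A small usage sketch, assuming the current grequests API:

import grequests

urls = ['http://httpbin.org', 'http://python-requests.org']

rs = (grequests.get(u) for u in urls)
for response in grequests.map(rs):
    # Skip entries for requests that raised an exception.
    if response is not None:
        print response.url, response.status_code
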
夏了南城 2024-11-10 19:55:50

Here is the code from eventlet:

urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
     "https://wiki.secondlife.com/w/images/secondlife.jpg",
     "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

import eventlet
from eventlet.green import urllib2

def fetch(url):

  return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()

for body in pool.imap(fetch, urls):
  print "got body", len(body)