Are urllib2 and httplib thread-safe?

Posted on 2024-11-03 19:28:49

I'm looking for information on thread safety of urllib2 and httplib.
The official documentation (http://docs.python.org/library/urllib2.html and http://docs.python.org/library/httplib.html) lacks any information on this subject; the word thread is not even mentioned there...

UPDATE

Ok, they are not thread-safe out of the box.
What's required to make them thread-safe or is there a scenario in which they can be thread-safe?
I'm asking because it seems that

  • using a separate OpenerDirector in each thread
  • not sharing HTTP connections among threads

would suffice to safely use these libs in threads. A similar usage scenario was proposed in the question "urllib2 and cookielib thread safety".
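
The setup described in the two bullets above could be sketched roughly as follows. This is only an illustration, not a verified answer to the question: it uses Python 3's urllib.request (the successor of urllib2) so that it runs on a current interpreter, and the helper names get_opener and fetch are made up for this example.

```python
import threading
import urllib.request  # Python 3 successor of urllib2 (assumption: Python 3 is used)

# Each thread sees its own attributes on this object.
_local = threading.local()

def get_opener():
    """Return an OpenerDirector that belongs to the calling thread only."""
    opener = getattr(_local, "opener", None)
    if opener is None:
        opener = urllib.request.build_opener()
        _local.opener = opener
    return opener

def fetch(url, timeout=10):
    """Hypothetical helper: fetch a URL through the thread's private opener."""
    # The module-level urlopen() and install_opener() are never touched here.
    with get_opener().open(url, timeout=timeout) as response:
        return response.read()
```

With this layout no OpenerDirector is visible to more than one thread, and the global opener used by urlopen() is left alone.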

Comments (1)

为你鎻心 2024-11-10 19:28:49

httplib and urllib2 are not thread-safe.

urllib2 does not provide serialized access to a global (shared)
OpenerDirector object, which is used by urllib2.urlopen().

Similarly, httplib does not provide serialized access to HTTPConnection objects (e.g. by using a thread-safe connection pool), so sharing HTTPConnection objects between threads is not safe.
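
To illustrate what "serialized access" could look like, here is a minimal sketch of a connection pool in which a connection is only ever held by one thread at a time. It is an assumption-laden example, not something httplib provides: the ConnectionPool class is hypothetical, and it uses Python 3's http.client (the renamed httplib) together with queue.Queue.

```python
import http.client  # Python 3 name of httplib (assumption: Python 3 is used)
import queue
from contextlib import contextmanager

class ConnectionPool:
    """Hypothetical minimal pool: one thread holds a given connection at a time."""

    def __init__(self, host, size=4, timeout=10):
        self._idle = queue.Queue()
        for _ in range(size):
            # Constructing an HTTPConnection does not open a socket yet.
            self._idle.put(http.client.HTTPConnection(host, timeout=timeout))

    @contextmanager
    def connection(self):
        conn = self._idle.get()   # blocks until some connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back for other threads
```

A thread would then write `with pool.connection() as conn: conn.request("GET", "/")` and read the response inside the with block. The sketch does not replace connections that end up in a broken state, which a real pool would have to handle.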

I suggest using httplib2 or urllib3 as an alternative if thread-safety is required.

Generally, if a module's documentation does not mention thread-safety, I would assume it is not thread-safe. You can look at the module's source code for verification.

When browsing the source code to determine whether a module is thread-safe, you
can start by looking for uses of thread synchronization primitives from the
threading or multiprocessing modules, or use of Queue.Queue (queue.Queue in Python 3).
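
As a rough first pass, that search can be automated. The sketch below is only a heuristic: the function name find_sync_hints and the list of keywords are my own choices, and a match (or the lack of one) proves nothing by itself about thread safety.

```python
import re

# Names that usually indicate some form of thread synchronization.
SYNC_HINTS = re.compile(
    r"\b(threading|multiprocessing|Lock|RLock|Semaphore|Condition|Queue)\b"
)

def find_sync_hints(source):
    """Return (line_number, text) pairs for lines mentioning a sync primitive."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SYNC_HINTS.search(line):
            hits.append((number, line.strip()))
    return hits
```

For a module whose source is available, the text can be obtained with inspect.getsource(module) and passed to find_sync_hints(). The hits are only starting points for reading the code by hand.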

UPDATE

Here is a relevant source code snippet from urllib2.py (Python 2.7.2):

_opener = None
def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    global _opener
    if _opener is None:
        _opener = build_opener()
    return _opener.open(url, data, timeout)

def install_opener(opener):
    global _opener
    _opener = opener

There is an obvious race condition when concurrent threads call install_opener() and urlopen().

Also, note that calling urlopen() with a Request object as the url parameter may mutate the Request object (see the source for OpenerDirector.open()), so it is not safe to concurrently call urlopen() with a shared Request object.

All told, urlopen() is thread-safe if the following conditions are met:

  • install_opener() is not called from another thread.
  • A non-shared Request object, or a string, is used as the url parameter.
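
A minimal sketch of code that respects both conditions might look like this. It assumes Python 3's urllib.request in place of urllib2, and build_request and fetch are hypothetical helper names, not part of the library.

```python
import urllib.request  # Python 3 successor of urllib2 (assumption: Python 3 is used)

def build_request(url, headers=None):
    """Create a new Request for every call, so no Request is ever shared."""
    # Copy the headers so callers can reuse one dict across threads.
    return urllib.request.Request(url, headers=dict(headers or {}))

def fetch(url, headers=None, timeout=10):
    """Hypothetical helper: safe to call from several threads concurrently,
    provided no thread calls install_opener() while requests are running."""
    request = build_request(url, headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

If a custom opener is needed, install_opener() would have to be called once, before any worker thread starts.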