Threading pool similar to the multiprocessing Pool?
Is there a Pool class for worker threads, similar to the multiprocessing module's Pool class?
For example, I like the easy way to parallelize a map function:

    import multiprocessing

    def long_running_func(p):
        return c_func_no_gil(p)

    p = multiprocessing.Pool(4)
    xs = p.map(long_running_func, range(100))

However, I would like to do this without the overhead of creating new processes.
I know about the GIL. However, in my use case the function will be an IO-bound C function, and the Python wrapper will release the GIL before the actual function call.
Do I have to write my own threading pool?
In Python 3 you can use concurrent.futures.ThreadPoolExecutor. See the docs for more info and examples.
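As an illustration of that suggestion, here is a minimal sketch of the question's map call rewritten with a thread pool. The body of `long_running_func` is a stand-in (an assumption): `time.sleep` takes the place of the question's GIL-releasing C call, and `run_in_threads` is a hypothetical helper name, not part of the standard library.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def long_running_func(p):
    # Stand-in for the question's C function; time.sleep also releases the GIL.
    time.sleep(0.01)
    return p * 2


def run_in_threads(func, items, workers=4):
    """Apply func to every item using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map yields results in the same order as the inputs.
        return list(executor.map(func, items))


xs = run_in_threads(long_running_func, range(100))
```

Leaving the `with` block waits for all submitted work to finish and shuts the worker threads down, so no explicit cleanup is needed.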
Yes, and it seems to have (more or less) the same API.
For something very simple and lightweight (slightly modified from here):
To support callbacks on task completion, you can just add the callback to the task tuple.
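The answer's own code is not reproduced here, so the following is only a sketch of the idea it describes: worker threads pull task tuples off a queue, and the callback travels inside the tuple. The class and method names (`ThreadPool`, `add_task`, `wait_completion`) and the `collect` callback are assumptions chosen for illustration.

```python
import queue
import threading


class ThreadPool:
    """Minimal pool: workers pull (func, args, callback) tuples off a queue."""

    def __init__(self, num_threads):
        self.tasks = queue.Queue()
        for _ in range(num_threads):
            # Daemon threads do not keep the interpreter alive at exit.
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            func, args, callback = self.tasks.get()
            try:
                result = func(*args)
                if callback is not None:
                    callback(result)
            except Exception as exc:
                # A failing task must not kill the worker thread.
                print(f"task failed: {exc!r}")
            finally:
                self.tasks.task_done()

    def add_task(self, func, *args, callback=None):
        """Queue a task; the callback is stored in the task tuple."""
        self.tasks.put((func, args, callback))

    def wait_completion(self):
        """Block until every queued task has been processed."""
        self.tasks.join()


results = []
results_lock = threading.Lock()


def collect(value):
    # Callbacks run in worker threads, so guard the shared list.
    with results_lock:
        results.append(value)


pool = ThreadPool(4)
for n in range(10):
    pool.add_task(pow, n, 2, callback=collect)
pool.wait_completion()
```

Results arrive in completion order rather than submission order, which is why the callback appends to a shared list instead of relying on position.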
Hi, to use a thread pool in Python you can use this library:
and then use it like this:
Here, threads is the number of threads you want, and tasks is the list of tasks to map onto the service.
Yes, there is a threading pool similar to the multiprocessing Pool; however, it is somewhat hidden and not properly documented. You can import it in the following way:
Here is a simple example:
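The import and example from this answer were not preserved, so what follows is a sketch under the assumption that the pool meant is `multiprocessing.pool.ThreadPool` from the standard library. The body of `long_running_func` is a placeholder for the question's C call.

```python
from multiprocessing.pool import ThreadPool
import time


def long_running_func(p):
    # Placeholder workload; time.sleep releases the GIL like the C call would.
    time.sleep(0.01)
    return p + 1


# Same call shape as multiprocessing.Pool(4), but backed by threads.
with ThreadPool(4) as pool:
    xs = pool.map(long_running_func, range(100))
```

Because the interface mirrors `multiprocessing.Pool`, the question's code only needs its constructor swapped.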
Here's what I finally ended up using. It's a modified version of the classes by dgorissen above.
File: threadpool.py
To use the pool:
Another way is to add the process to the thread queue pool:
The overhead of creating the new processes is minimal, especially when there are just 4 of them. I doubt this is a performance hot spot in your application. Keep it simple, and optimize only where you have to and where profiling results point.
There is no built-in thread-based pool. However, it can be very quick to implement a producer/consumer queue with the Queue class.
From: https://docs.python.org/2/library/queue.html
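The snippet quoted from the docs is missing here, so this is a sketch of the producer/consumer pattern the answer refers to, written for Python 3, where the module is named `queue` rather than `Queue`. The `process` function, the `None` sentinel, and the variable names are assumptions for illustration.

```python
import queue
import threading

NUM_WORKERS = 4


def process(item):
    # Hypothetical unit of work.
    return item * item


def worker(q, out, lock):
    """Consume items until a None sentinel arrives."""
    while True:
        item = q.get()
        try:
            if item is None:
                break
            value = process(item)
            with lock:
                out.append(value)
        finally:
            # Runs for real items and for the sentinel alike.
            q.task_done()


q = queue.Queue()
out = []
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(q, out, lock))
    for _ in range(NUM_WORKERS)
]
for t in threads:
    t.start()

# Producer side: enqueue the work, then wait for it to drain.
for item in range(20):
    q.put(item)
q.join()

# One sentinel per worker lets every thread exit cleanly.
for _ in threads:
    q.put(None)
for t in threads:
    t.join()
```

The sentinel is one way to stop the workers; marking the threads as daemons and skipping the shutdown step is a common alternative.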
If you don't mind running someone else's code, here's mine:
Note: There is a lot of extra code you may want to remove; it was added for clarification and to demonstrate how it works.
Note: Python naming conventions are used for method and variable names instead of camelCase.
Working procedure:
Code:
I just found out that there actually is a thread-based Pool interface in the multiprocessing module; however, it is somewhat hidden and not properly documented. It can be imported via:
It is implemented using a dummy Process class wrapping a Python thread. This thread-based Process class can be found in multiprocessing.dummy, which is mentioned briefly in the docs. This dummy module supposedly provides the whole multiprocessing interface based on threads.
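To make the "based on threads" point concrete, here is a small sketch using `multiprocessing.dummy.Pool`. The `worker_pid` function is a hypothetical helper: it reports the process ID seen by each task, which should always be the parent's, since no child processes are created.

```python
import os
from multiprocessing.dummy import Pool  # thread-backed counterpart of multiprocessing.Pool


def worker_pid(_):
    # Every task reports the ID of the process it runs in.
    return os.getpid()


with Pool(4) as pool:
    pids = pool.map(worker_pid, range(20))

# All tasks ran inside the parent process, i.e. on threads.
ran_in_parent = set(pids) == {os.getpid()}
```

Running the same function through `multiprocessing.Pool` would instead report the worker processes' IDs.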