Multiprocessing or threading in Python?

Posted 2024-07-30 18:23:25


I have a Python application that grabs a collection of data and performs a task on each piece of data in that collection. The task takes some time to complete because there is a delay involved. Because of this delay, I don't want the tasks to run one after another; I want them all to happen in parallel. Should I be using multiprocessing or threading for this operation?

I attempted to use threading but ran into some trouble: often some of the tasks would never actually fire.

Comments (8)

琉璃梦幻 2024-08-06 18:23:25


If you are truly compute bound, using the multiprocessing module is probably the lightest-weight solution (in terms of both memory consumption and implementation difficulty).

If you are I/O bound, using the threading module will usually give you good results. Make sure you use thread-safe storage (such as queue.Queue) to hand data to your threads, or else hand each one a single piece of data that is unique to it when it is spawned.
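
For example, a minimal sketch of that Queue-based hand-off (do_task and collection here are hypothetical stand-ins for the question's slow task and data set):

import threading
import queue

def worker(in_q, out_q):
    while True:
        item = in_q.get()
        if item is None:               # sentinel value tells this worker to exit
            break
        out_q.put(do_task(item))       # do_task: the hypothetical slow, I/O-bound task

in_q, out_q = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(in_q, out_q)) for _ in range(4)]
for t in threads:
    t.start()
for item in collection:               # 'collection' stands in for your data set
    in_q.put(item)
for _ in threads:
    in_q.put(None)                    # one sentinel per worker thread
for t in threads:
    t.join()
results = [out_q.get() for _ in range(out_q.qsize())]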

PyPy is focused on performance. It has a number of features that can help with compute-bound processing. It also has support for Software Transactional Memory, although that is not yet production quality. The promise is that you can use simpler parallel or concurrent mechanisms than multiprocessing (which has some awkward requirements).

Stackless Python is also a nice idea. Stackless has portability issues, as discussed in another answer below. Unladen Swallow was promising, but is now defunct. Pyston is another (unfinished) Python implementation focusing on speed. It takes a different approach from PyPy's, which may yield better (or just different) speedups.

温柔嚣张 2024-08-06 18:23:25


Threads only run one at a time (interleaved), but they give you the illusion of running in parallel. They are a good choice when the tasks involve file or connection I/O, because they are lightweight.

Multiprocessing with a Pool may be the right solution for you, because processes run truly in parallel, which makes them very good for intensive computing: each process runs on its own CPU (or core).

Setting up multiprocessing can be very easy:

from multiprocessing import Pool

def worker(input_item):
    output = do_some_work(input_item)  # pass each item to your work function
    return output

if __name__ == '__main__':  # required on platforms that spawn processes, e.g. Windows
    pool = Pool()  # one process per CPU (or core) by default; use Pool(4) to force 4 processes
    list_of_results = pool.map(worker, input_list)  # launches all work and collects results in order
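
Note that pool.map blocks until every result is ready and returns the results in input order; if you would rather consume results as they complete, pool.imap_unordered takes the same arguments but yields results in completion order.
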
花想c 2024-08-06 18:23:25


For small collections of data, simply create subprocesses with subprocess.Popen.

Each subprocess can get its piece of data from stdin or from command-line arguments, do its processing, and write the result to an output file.

When the subprocesses have all finished (or timed out), you simply merge the output files.

Very simple.
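
A rough sketch of that pattern (worker.py and the file names here are hypothetical; assume worker.py reads one command-line argument, processes it, and prints its result to stdout):

import subprocess

items = ['alpha', 'beta', 'gamma']                # the collection of data
procs = []
for i, item in enumerate(items):
    out = open('out_%d.txt' % i, 'w')
    # each child gets its piece of data as a command-line argument
    p = subprocess.Popen(['python', 'worker.py', item], stdout=out)
    procs.append((p, out))

for p, out in procs:
    p.wait()                                      # or p.wait(timeout=...) to enforce a limit
    out.close()

with open('merged.txt', 'w') as merged:           # merge the per-process output files
    for i in range(len(items)):
        with open('out_%d.txt' % i) as f:
            merged.write(f.read())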

苏大泽ㄣ 2024-08-06 18:23:25


You might consider looking into Stackless Python. If you have control over the function that takes a long time, you can just throw some stackless.schedule()s in there (each one yields control to the next coroutine), or else you can set Stackless up for preemptive multitasking.

In Stackless, you don't have threads, but tasklets or greenlets, which are essentially very lightweight threads. It works great in the sense that there's a pretty good framework with very little setup needed to get multitasking going.

However, Stackless hinders portability because you have to replace a few of the standard Python libraries -- Stackless removes reliance on the C stack. It's very portable if the next user also has Stackless installed, but that will rarely be the case.
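
For what it's worth, the cooperative version looks roughly like this (a sketch that runs only on the Stackless interpreter, not stock CPython; do_slice_of_work is a hypothetical placeholder):

import stackless  # only available in the Stackless Python interpreter

def long_task(name):
    for i in range(3):
        do_slice_of_work(name, i)   # hypothetical: one slice of the long-running work
        stackless.schedule()        # yield control to the next runnable tasklet

stackless.tasklet(long_task)('a')   # create two cooperating tasklets
stackless.tasklet(long_task)('b')
stackless.run()                     # run the scheduler until all tasklets finish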

傾城如夢未必闌珊 2024-08-06 18:23:25


Using CPython's threading model will not give you any performance improvement for CPU-bound work, because the Global Interpreter Lock (GIL) prevents threads from executing Python bytecode in parallel (the GIL exists largely to protect CPython's reference-counting memory management). Multiprocessing does allow parallel execution. Obviously, in this case you have to have multiple cores available to farm your parallel jobs out to.

There is much more information available in this related question.
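
You can see this for yourself with a small experiment: run the same CPU-bound map on a thread pool and on a process pool (multiprocessing.dummy provides a thread-backed Pool with the same API). On CPython, only the process version should show a real speedup:

import time
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

def cpu_bound(n):
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    work = [2_000_000] * 8
    for label, make_pool in (('threads', ThreadPool), ('processes', Pool)):
        start = time.time()
        with make_pool(4) as pool:
            pool.map(cpu_bound, work)
        print('%s: %.2fs' % (label, time.time() - start))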

又怨 2024-08-06 18:23:25


If you can easily partition and separate the data you have, it sounds like you should just do that partitioning externally and feed the pieces to several processes of your program (i.e. several processes instead of threads).
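
A minimal sketch of that idea (handle_item is hypothetical): split the data into one chunk per process and let each process work through its chunk independently:

from multiprocessing import Process

def handle_chunk(chunk):
    for item in chunk:
        handle_item(item)                         # hypothetical per-item task

if __name__ == '__main__':
    data = list(range(100))                       # stand-in for your collection
    n = 4
    chunks = [data[i::n] for i in range(n)]       # round-robin partition into n slices
    procs = [Process(target=handle_chunk, args=(chunk,)) for chunk in chunks]
    for p in procs:
        p.start()
    for p in procs:
        p.join()                                  # wait for every process to finish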

深居我梦 2024-08-06 18:23:25


IronPython has real multithreading, unlike CPython and its GIL. So depending on what you're doing, it may be worth looking at. But it sounds like your use case is better suited to the multiprocessing module.

To the guy who recommends Stackless Python: I'm not an expert on it, but it seems to me that he's talking about software "multithreading", which is not actually parallel at all (it still runs in one physical thread, so it cannot scale to multiple cores). It's merely an alternative way to structure an asynchronous (but still single-threaded, non-parallel) application.

我不是你的备胎 2024-08-06 18:23:25


You may want to look at Twisted. It is designed for asynchronous network tasks.
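
A tiny sketch of the Twisted style, assuming the per-item latency is network-like (the delay here is simulated with reactor.callLater; real code would make an actual network call):

from twisted.internet import defer, reactor

def process_item(item):
    d = defer.Deferred()
    # simulate a task with latency; fire the deferred with a result after 1s
    reactor.callLater(1.0, d.callback, item * 2)
    return d

def done(results):
    print(results)          # all three items finish after ~1s total, not ~3s
    reactor.stop()

defer.gatherResults([process_item(i) for i in [1, 2, 3]]).addCallback(done)
reactor.run()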
