A thread pool that lets me know when at least 1 thread has finished?
I need to use a thread pool in Python, and I want to know when at least one thread out of the "maximum threads allowed" has finished, so that I can start another one if I still have work to do.
I have been using something like this:
    import threading

    def doSomethingWith(dataforthread):
        global i
        dostuff()
        i = i - 1  # thread has finished

    i = 0
    poolSize = 5
    threads = []
    data = [...]  # array of data
    while len(data):
        # if the number of started threads is < poolSize, start a new thread
        while data and i < poolSize:
            dataforthread = data.pop(0)
            i = i + 1
            thread = threading.Thread(target=doSomethingWith, args=(dataforthread,))
            thread.start()
            threads.append(thread)
        for t in threads:  # wait for ALL threads (I ONLY WANT TO WAIT FOR 1 [any])
            t.join()
As I understand it, my code opens 5 threads and then waits for all of them to finish before starting new ones, until the data is consumed. But what I really want is to start a new thread as soon as one of the threads finishes and the pool has an "available spot" for a new thread.
I have been reading this, but I think it would have the same issue as my code (not sure, I'm new to Python, but from looking at joinAll() it seems so).
Does anyone have an example that does what I am trying to achieve?
I mean detecting as soon as i is less than poolSize, launching new threads until i = poolSize, and repeating that until the data is consumed.
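The loop described above (start a thread whenever a spot is free, and block only until any one thread finishes) can be sketched with a counting semaphore in place of the hand-maintained counter i. This is a minimal sketch, not the only way to do it: run_bounded, worker, and pool_size are hypothetical names introduced for illustration, and it assumes one short-lived thread per item is acceptable.

```python
import threading


def run_bounded(items, worker, pool_size=5):
    """Run worker(item) in its own thread, with at most pool_size alive at once.

    Hypothetical helper: the names here are illustrative, not from the question.
    """
    slots = threading.BoundedSemaphore(pool_size)  # one permit per free spot
    results = []
    results_lock = threading.Lock()
    threads = []

    def task(item):
        try:
            value = worker(item)
            with results_lock:
                results.append(value)
        finally:
            slots.release()  # frees a spot, which unblocks the loop below

    for item in items:
        slots.acquire()  # blocks only until ANY one running thread finishes
        t = threading.Thread(target=task, args=(item,))
        t.start()
        threads.append(t)

    for t in threads:  # everything has been started; wait for the last ones
        t.join()
    return results
```

The semaphore replaces the shared counter, so the main loop never has to join every thread just to find out that one spot has opened up. Results arrive in completion order, not input order.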
Comments (2)
As the article author mentions, and @getekha highlights, thread pools in Python don't accomplish exactly the same thing as they do in other languages. If you need parallelism, you should look into the multiprocessing module. Among other things, it has the handy Queue and Pool constructs. Also, there's an accepted PEP for "futures" that you'll probably want to monitor.

The problem is that Python has a Global Interpreter Lock (GIL), which must be held to run any Python code. This means that only one thread can execute Python code at any time, so thread pools in Python are not the same as in other languages. This is mainly for arcane reasons known only to a select few (i.e. it's complicated).
If you really want to run code in parallel, you should spawn new processes; the multiprocessing module has a Pool class which you could look into.
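The "futures" PEP mentioned in the first answer (PEP 3148) has since landed in the standard library as concurrent.futures, available from Python 3.2. The sketch below shows how it could cover the behaviour asked for in the question: the executor keeps at most pool_size workers busy and hands the next item to whichever worker frees up first, while as_completed reports each task the moment it finishes. process_all, worker, and pool_size are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_all(items, worker, pool_size=5):
    """Yield (item, result) pairs in completion order, not submission order."""
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        # All items are queued up front; the executor starts the next one
        # as soon as any of its pool_size worker threads becomes free.
        future_to_item = {executor.submit(worker, item): item for item in items}
        for future in as_completed(future_to_item):
            # future.result() re-raises any exception raised inside worker
            yield future_to_item[future], future.result()


if __name__ == "__main__":
    for item, result in process_all(range(10), lambda x: x * x):
        print(item, result)
```

Because of the GIL described above, this thread-based version mostly helps with I/O-bound work. For CPU-bound work, ProcessPoolExecutor from the same module can be swapped in, provided worker is a picklable top-level function (so not a lambda) and the script is guarded by if __name__ == "__main__".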