Python multiprocessing - just not getting it
I've been spending some time trying to understand multiprocessing, though its finer points evade my untrained mind. I've been able to get a pool to return a simple integer, but I get stuck when the function does more than just return a result, unlike all of the examples I can find (even in the documentation, it's some obscure example I can't quite understand).
Here is an example I'm trying to get working. I can't get it working as intended, and I'm sure there's a simple reason why. I may need to use a queue or shared memory or a manager, but however many times I read the documentation, I can't seem to wrap my brain around what each of those actually means and does. All I've been able to understand so far is the pool function.
Also, I'm using a class because I need to avoid using global variables, as in this question's answer.
import random

class thisClass:
    def __init__(self):
        self.i = 0

def countSixes(myClassObject):
    newNum = random.randrange(0,10)
    #print(newNum) #this proves the function is being run if enabled
    if newNum == 6:
        myClassObject.i += 1

if __name__ == '__main__':
    import multiprocessing
    pool = multiprocessing.Pool(1) #use one core for now
    counter = thisClass()
    myList = []
    [myList.append(x) for x in range(1000)]
    #it must be (args,) instead of just i, apparently
    async_results = [pool.apply_async(countSixes, (counter,)) for i in myList]
    for x in async_results:
        x.get(timeout=1)
    print(counter.i)
Can someone explain in dumb-dumb what needs to be done so I can finally understand what I'm missing and what it does?
It took me a while to understand what you want to happen. The problem has to do with the way multiprocessing works. Basically, you need to write your program in a functional style, instead of relying on side-effects as you do now.
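The difference between the two styles can be seen without any pool at all. The sketch below is only an illustration: Counter, bump and bumped are made-up names, and copy.deepcopy stands in for the pickling that multiprocessing does when it ships an argument to a worker process (both produce an independent copy).

```python
import copy


class Counter:  # hypothetical stand-in for thisClass
    def __init__(self):
        self.i = 0


def bump(counter):
    """Side-effect style: mutates its argument and returns nothing."""
    counter.i += 1


def bumped(value):
    """Functional style: leaves its input alone and returns the new value."""
    return value + 1


original = Counter()
# Assumption: deepcopy plays the role of the copy a worker process receives.
worker_copy = copy.deepcopy(original)

bump(worker_copy)
print(original.i)     # 0, the original never saw the increment
print(worker_copy.i)  # 1, only the copy changed

print(bumped(original.i))  # 1, a return value is something the caller can keep
```

A side effect on the copy has no way back to the caller; a return value does.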
Right now, you're sending out objects to your pool to be modified and returning nothing from countSixes. That won't work with multiprocessing, because in order to sidestep the GIL, multiprocessing creates a copy of counter and sends it to a brand new interpreter. So when you increment i, you're actually incrementing a copy of i, and then, because you return nothing, you are discarding it!
To do something useful, you have to return something from countSixes. Here's a simplified version of your code that does something similar to what you want. I left an argument in, just to show what you ought to be doing, but really this could be done with a zero-arg function.
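A minimal sketch of such a version, assuming one worker and 1000 tasks as in the question. The names count_six and start are illustrative, and this is not necessarily the code the answer originally showed:

```python
import multiprocessing
import random


def count_six(start):
    """Roll one number in 0..9; return start + 1 on a six, else start."""
    if random.randrange(0, 10) == 6:
        return start + 1
    return start


if __name__ == '__main__':
    # One worker process, as in the question.
    with multiprocessing.Pool(1) as pool:
        async_results = [pool.apply_async(count_six, (0,)) for _ in range(1000)]
        # Each .get() hands back the worker's return value. The summing
        # happens here in the parent process, where the total is visible.
        total = sum(r.get(timeout=10) for r in async_results)
    print(total)  # roughly 100 on average, varies from run to run
```

Nothing is mutated across the process boundary here: each task returns 0 or 1, and the parent adds the results up. Since the argument is always 0, count_six could equally take no arguments.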