Python multiprocessing

Nothing after 'if __name__ == "__main__":' ever executes

Posted on 2025-01-02 16:45:40


So, here's my situation.

I'm using PyDev in Eclipse, Python interpreter version 2.7.2 in Windows.

I'm using the built-in multiprocessing library in an attempt to fork off a bunch of processes to parallelize a very compute-intensive loop. The tutorials I've looked at say to use

if __name__ == "__main__":

to prevent it from spawning off near-infinite processes and bringing my system to its knees, essentially.

The problem is, I am calling this from a module, not my main script; as such, nothing after it EVER gets executed. No chance for parallelism at all. Of course, if I remove it, I get the infiniprocess spam that kills the machine executing the code.

For reference's sake, here's the relevant code:

from tribe import DataCache
from tribe import WorldThread
from tribe import Actor
from time import sleep
import multiprocessing

class World:
    def __init__(self, numThreads, numActors, tickRate):
        print "Initalizing world..."
        self.cache = DataCache.DataCache()
        self.numThreads = numThreads
        self.numActors = numActors
        self.tickRate = tickRate
        self.actors = []
        self.processes = []
        for i in range(numActors):
            self.actors.append(Actor.Actor("test.xml", self.cache))
        print "Actors loaded."

    def start_world(self):
        print "Starting world"
        run_world = True
        while run_world:
            self.world_tick()
            sleep(2)

    def world_tick(self):
        if __name__ == '__main__':
            print "World tick"
            actor_chunk = len(self.actors)/self.numThreads
            if len(self.processes)==0:
                for _ in range(self.numThreads):
                    new_process = multiprocessing.Process(WorldThread.WorldProcess.work, args=(_, self.actors[_*actor_chunk,(_+1)*actor_chunk]))

And the class it is calling:

class WorldProcess():
    def __init__(self):
        print "World process initilized."
        ''' Really, I'm not sure what kind of setup we'll be doing here yet. '''

    def work(self, process_number, actors):
        print "World process" + str(process_number) + " running."
        for actor in actors:
            actor.tick()
        print "World process" + str(process_number) + " completed."

Am I correct in my assessment that the whole if __name__ == "__main__": check only works if you have it in the executable script itself? If so, how do you safely fork off processes from within modules? If not, why isn't it working here?


柠檬色的秋千 2025-01-09 16:45:40

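Adding this as an answer, since it was in the comments:

if __name__ == "__main__" is something you do at the root level of the script that is going to be the entry point. It is a way to run code only when the script is executed directly.

If you have a script that is the entry point, that is where the __name__ == "__main__" check goes. In the module where you want to use multiprocessing, you just loop and start your processes the same way you would loop and start threads.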
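As a minimal sketch of that split, assuming a single file for brevity: the functions play the role of the module, and only the last block plays the role of the entry script. The names work, split_into_chunks and start_workers are made up for illustration and are not part of the asker's tribe package.

```python
import multiprocessing


def work(process_number, items):
    """Worker body; a module-level function so child processes can find it."""
    print("World process %d running on %d items." % (process_number, len(items)))


def split_into_chunks(items, num_chunks):
    """Split items into num_chunks contiguous slices; the last takes the remainder."""
    size = len(items) // num_chunks
    chunks = [items[i * size:(i + 1) * size] for i in range(num_chunks - 1)]
    chunks.append(items[(num_chunks - 1) * size:])
    return chunks


def start_workers(items, num_processes):
    """Module code: no __name__ check here, just loop and start processes."""
    processes = []
    for number, chunk in enumerate(split_into_chunks(items, num_processes)):
        p = multiprocessing.Process(target=work, args=(number, chunk))
        p.start()
        processes.append(p)
    return processes


if __name__ == "__main__":
    # Entry script: the only place that needs the guard.
    for p in start_workers(list(range(10)), 2):
        p.join()
```

In a real project the three functions would live in the module and the guarded block in the script that imports it.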


不羁少年 2025-01-09 16:45:40

To control the number of processes, use the Pool class from multiprocessing:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == "__main__":  # needed in the entry script on Windows
    p = Pool(5)  # create the pool after f is defined
    print(p.map(f, [1,2,3]))

(Edit: as per the comments, this is just a how-to for the Pool class.)

Checking __name__ inside world_tick is not required, since you explicitly pass Process the actual Python function to run.

This:

def world_tick(self):
    if __name__ == '__main__':
        print "World tick"
        actor_chunk = len(self.actors)/self.numThreads
        if len(self.processes)==0:
            for _ in range(self.numThreads):
                new_process = multiprocessing.Process(WorldThread.WorldProcess.work, args=(_, self.actors[_*actor_chunk,(_+1)*actor_chunk]))

is very bad. Simplify it.

A better pattern will be:

class WorkArgs(object):
    ... many attributes follow ...

def proc_work(task):
    # Pool.map passes each list item as a single argument, so unpack the tuple
    world_thread, work_args = task
    world_thread.WorldProcess.work(work_args.a, work_args.b, ... etc)

p = Pool(5)
p.map(proc_work, [(world_thread, args0), (world_thread, args1), ...])
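A runnable sketch of this pattern could look like the following. It rests on assumptions: the Actor class is a hypothetical stand-in for tribe.Actor, and make_tasks is an illustrative helper that is not in the original code. It also shows a point that is easy to miss: the workers receive pickled copies of the actors, so the updated objects have to be returned to the parent.

```python
from multiprocessing import Pool


class Actor(object):
    """Hypothetical stand-in for tribe.Actor; it only counts its ticks."""

    def __init__(self, name):
        self.name = name
        self.ticks = 0

    def tick(self):
        self.ticks += 1


def make_tasks(actors, num_chunks):
    """Pair each contiguous slice of actors with a chunk number."""
    size = max(1, -(-len(actors) // num_chunks))  # ceiling division
    return [(number, actors[start:start + size])
            for number, start in enumerate(range(0, len(actors), size))]


def proc_work(task):
    """Runs in a worker process; Pool.map hands it one task tuple."""
    process_number, actors = task
    for actor in actors:
        actor.tick()
    # The worker operates on pickled copies, so return the updated actors.
    return process_number, actors


if __name__ == "__main__":
    actors = [Actor("actor-%d" % i) for i in range(5)]
    pool = Pool(2)
    try:
        results = pool.map(proc_work, make_tasks(actors, 2))
    finally:
        pool.close()
        pool.join()
    for process_number, ticked in results:
        print("chunk %d ticks: %s" % (process_number, [a.ticks for a in ticked]))
```

The slices use a colon (actors[start:start + size]); the comma in the question's self.actors[...] expression indexes the list with a tuple and raises a TypeError.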

Hope this helps!

As a side note, pickling your arguments and passing them to other processes will result in importing your module. So, it is best to make sure your module doesn't perform any forking/magic/work unless it is told to (e.g., it only has function/class definitions or __name__ magic, not actual code blocks).
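A small sketch of why that matters, using only the standard library: any statement at module level runs again whenever a child process has to import the file. The names work and main are illustrative. On Windows, where children are started by re-importing the main module, the first print would be expected to appear once per process; with the fork start method on Linux it appears only once.

```python
import multiprocessing
import os

# Module-level statement: it runs on every import of this file. When child
# processes are started by re-importing the main module (as on Windows),
# this line executes again in each child.
print("module body running in pid %d" % os.getpid())


def work(x):
    return x * x


def main():
    child = multiprocessing.Process(target=work, args=(3,))
    child.start()
    child.join()
    return child.exitcode


if __name__ == "__main__":
    # Only the process that was started as a script gets here.
    print("child exit code: %s" % main())
```

If the process-starting call in main were moved to module level instead, each re-import would start another child, which is the runaway spawning described in the question.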