Python process pools and scope

Published 2024-08-06 03:45:13


I am trying to run autogenerated code (which might potentially not terminate) in a loop, for genetic programming. I'm trying to use a multiprocessing pool for this, since I don't want the big performance overhead of creating a new process each time, and I can terminate the pool process if it runs too long (which I can't do with threads).

Essentially, my program is

if __name__ == '__main__':    
    pool = Pool(processes=1)            
    while ...:
        source = generate() #autogenerate code
        exec(source)
        print meth() # just a test, prints a result, since meth was defined in source
        result = pool.apply_async(meth)
        try:
            print result.get(timeout=3)  
        except:
            pool.terminate()

This is the code that should work, but doesn't; instead I get

AttributeError: 'module' object has no attribute 'meth'

It seems that Pool only sees the meth method if it is defined at the very top level. Any suggestions on how to get it to run a dynamically created method?

Edit:
The problem is exactly the same with Process, i.e.

source = generated()
exec(source)
if __name__ == '__main__':    
    p = Process(target = meth)
    p.start()

works, while

if __name__ == '__main__':    
    source = generated()
    exec(source)
    p = Process(target = meth)
    p.start()

doesn't, and fails with an AttributeError

Comments (3)

行雁书 2024-08-13 03:45:13

Did you read the programming guidelines? There is lots of stuff in there about global variables. There are even more limitations under Windows. You don't say which platform you are running on, but this could be the problem if you are running under Windows. From the above link

Global variables

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start() was called.

However, global variables which are just module level constants cause no problems.
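The quoted guideline can be seen directly in a minimal Python 3 sketch (under a `fork` start method the child inherits the parent's current globals, while under Windows-style `spawn` it only sees what ran at import time):

```python
import multiprocessing as mp

CONSTANT = "set at import time"   # module-level constant: always safe
mutable = "original"

def child():
    # What the child sees here depends on the start method: with spawn
    # it re-imports this module, so changes made in the parent after
    # import time are invisible; with fork it inherits them.
    return CONSTANT, mutable

if __name__ == '__main__':
    mutable = "changed in parent"
    with mp.Pool(processes=1) as pool:
        print(pool.apply_async(child).get(timeout=3))
```

On Linux (fork) the second element is "changed in parent"; on Windows (spawn) it is "original", which is exactly the asymmetry that bites the asker's `exec`'d `meth`.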

一梦浮鱼 2024-08-13 03:45:13

Process (via pool or otherwise) won't have a __name__ of '__main__', so it will not execute anything that depends on that condition -- including the exec statements that you depend on in order to find your meth, of course.

Why are you so keen on having that exec guarded by a condition that, by design, IS going to be false in your sub-process, yet have that sub-process depend (contradictorily!) on the execution of that exec...?! It's really boggling my mind...
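In other words, the `exec` has to run at module level, outside the guard, so that a re-importing child rebuilds `meth` for itself. A sketch of that point (the hard-coded `source` stands in for the asker's `generated()`):

```python
from multiprocessing import Process

# At module level, NOT under the __main__ guard: if the child has to
# re-import this module (as it does on Windows), this exec runs again
# there, so meth exists in the child too.
source = "def meth():\n    print('hello from meth')\n"
exec(source)

if __name__ == '__main__':
    p = Process(target=meth)   # meth was defined by the exec above
    p.start()
    p.join()
```

This is why the asker's first Process variant works and the guarded one does not; it still leaves the problem that a GP loop regenerates different source each iteration.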

世态炎凉 2024-08-13 03:45:13

As I commented above, all your examples are working as you expect on my Linux box (Debian Lenny, Python2.5, processing 0.52, see test code below).

There seem to be many restrictions on Windows on the objects you can transmit from one process to another. Reading the doc pointed out by Nick, it seems that on Windows, where the OS lacks fork, it will start a brand-new Python interpreter, import modules, and pickle/unpickle the objects that should be passed around. If they can't be pickled, I expect you'll get the kind of problem that occurred to you.

Hence a complete (non-)working example may be useful for diagnosis. The answer may be in the things you've hidden as irrelevant.

from processing import Pool
import os

def generated():
    return (
"""
def meth():
    import time
    starttime = time.time()
    pid = os.getpid()
    while 1:
        if time.time() - starttime > 1:
            print "process %s" % pid
            starttime = starttime + 1

""")


if __name__ == '__main__':
    pid = os.getpid()
    print "main pid=%s" % pid
    for n in range(5):
        source = generated() #autogenerate code
        exec(source)
        pool = Pool(processes=1)            
        result = pool.apply_async(meth)
        try:
            print result.get(timeout=3)  
        except:
            pool.terminate()

Another suggestion would be to use threads. Yes, you can, even if you don't know whether your generated code will stop, or whether it has differently nested loops. Loops are no restriction at all; that's precisely a point in favor of using generators (extracting control flow). I do not see why it couldn't apply to what you are doing. [Agreed, it is probably more of a change than independent processes.] See the example below.

import time

class P(object):
    def __init__(self, name):
        self.name = name
        self.starttime = time.time()
        self.lastexecutiontime = self.starttime
        self.gen = self.run()

    def toolong(self):
        if time.time() - self.starttime > 10:
            print "process %s too long" % self.name
            return True
        return False

class P1(P):
    def run(self):
        for x in xrange(1000):
            for y in xrange(1000):
                for z in xrange(1000):
                    if time.time() - self.lastexecutiontime > 1:
                        print "process %s" % self.name
                        self.lastexecutiontime = self.lastexecutiontime + 1
                        yield
        self.result = self.name.upper()

class P2(P):
    def run(self):
        for x in range(10000000):
            if time.time() - self.lastexecutiontime > 1:
                print "process %s" % self.name
                self.lastexecutiontime = self.lastexecutiontime + 1
                yield
        self.result = self.name.capitalize()

pool = [P1('one'), P1('two'), P2('three')]
while len(pool) > 0:
    current = pool.pop()
    try:
        current.gen.next()
    except StopIteration:
        print "Thread %s ended. Result '%s'" % (current.name, current.result) 
    else:
        if current.toolong():
            print "Forced end for thread %s" % current.name 
        else:
            pool.insert(0, current)