How to dynamically create per-process queues in Python multiprocessing
I want to dynamically create multiple Processes, where each instance has a queue for incoming messages from other instances, and each instance can also create new instances. So we end up with a network of processes all sending to each other. Every instance is allowed to send to every other.
The code below would do what I want: it uses a Manager.dict() to store the queues, making sure updates are propagated, and a Lock() to protect write access to the queues. However, when adding a new queue it throws "RuntimeError: Queue objects should only be shared between processes through inheritance".
The problem is that at start-up we don't know how many queues will eventually be needed, so we have to create them dynamically. But since queues can only be shared at the moment a process is constructed, I don't know how to do that.
I know that one possibility would be to make queues a global variable instead of a managed one passed in to __init__: the problem then, as I understand it, is that additions to the queues variable wouldn't be propagated to other processes.
EDIT: I'm working on evolutionary algorithms (EAs), a type of machine learning technique. An EA simulates a "population", which evolves by survival of the fittest, crossover, and mutation. In parallel EAs, as here, we also have migration between populations (islands), which corresponds to interprocess communication. Islands can also spawn new islands, so we need a way to send messages between dynamically created processes.
import random, time
from multiprocessing import Process, Queue, Lock, Manager, current_process

try:
    from queue import Empty as EmptyQueueException  # Python 3
except ImportError:
    from Queue import Empty as EmptyQueueException  # Python 2


class MyProcess(Process):

    def __init__(self, queues, lock):
        # run() is overridden below, so no target is needed
        super(MyProcess, self).__init__()
        self.queues = queues
        self.lock = lock
        # acquire lock and add a new queue for this process
        with self.lock:
            self.id = len(list(self.queues.keys()))
            # this assignment is where the RuntimeError is raised
            self.queues[self.id] = Queue()

    def run(self):
        while len(list(self.queues.keys())) < 10:
            # make a new process (it needs the shared dict as well as the lock)
            new = MyProcess(self.queues, self.lock)
            new.start()

            # send a message to a random process
            dest_key = random.choice(list(self.queues.keys()))
            dest = self.queues[dest_key]
            dest.put("hello to %s from %s" % (dest_key, self.id))

            # receive messages
            message = True
            while message:
                try:
                    message = self.queues[self.id].get(False)  # don't block
                    print("%s received: %s" % (self.id, message))
                except EmptyQueueException:
                    break

            # what queues does this process know about?
            print("%d: I know of %s" %
                  (self.id, " ".join([str(id) for id in self.queues.keys()])))

            time.sleep(1)


if __name__ == "__main__":
    # Construct MyProcess with a Manager.dict for storing the queues
    # and a lock to protect write access. Start.
    MyProcess(Manager().dict(), Lock()).start()
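The error quoted in the question can be reproduced in a single process, which may help show where it comes from. The following is a minimal sketch, assuming Python 3; the helper name try_pickle_queue is made up for illustration. A Manager.dict() proxy pickles each value it is given in order to send it to the manager process, and a multiprocessing.Queue refuses to be pickled unless it is being handed to a child process as that child is started.

```python
import multiprocessing
import pickle


def try_pickle_queue():
    """Pickle a multiprocessing.Queue directly and return the error text.

    Hypothetical helper for illustration; returns None if no error occurs.
    """
    q = multiprocessing.Queue()
    try:
        # Storing a value in a Manager.dict() triggers the same pickling step.
        pickle.dumps(q)
    except RuntimeError as exc:
        return str(exc)
    finally:
        q.close()
    return None


if __name__ == "__main__":
    print(try_pickle_queue())
```

This suggests the lock and the dict are not the problem: any route that pickles the Queue after start-up fails in the same way.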
Comments (3)
I'm not entirely sure what your use case actually is here. Perhaps if you elaborate a bit more on why you want each process to dynamically spawn a child with a connected queue, it will be clearer what the right solution would be in this situation.
Anyway, with the question as it is, it seems that there is not really a good way to dynamically create pipes or queues with multiprocessing right now.
I think that if you're willing to spawn threads within each of your processes, you may be able to use multiprocessing.connection.Listener/Client to communicate back and forth. Rather than spawning threads, I took an approach using network sockets and select to communicate between threads.
Dynamic process spawning and network sockets may still be flaky depending on how multiprocessing cleans up your file descriptors when spawning/forking a new process, and your solution will most likely work more easily on *nix derivatives. If you're concerned about socket overhead, you could use unix domain sockets to be a little more lightweight, at the cost of added complexity when running nodes on multiple worker machines.
Anyway, here's an example using network sockets and a global process list to accomplish this, since I was unable to find a good way to make multiprocessing do it.
With a lot of polish and testing love, this might be a logical extension to multiprocessing.Process and/or multiprocessing.Pool, as this does seem like something people would use if it were available in the standard lib. It may also be reasonable to create a DynamicQueue class that uses sockets to be discoverable to other queues.
Anyway, hope it helps. Please update if you figure out a better way to make this work.
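The Listener/Client idea mentioned in this answer can be sketched roughly as follows. This is not the answer's own code. It is a minimal illustration, assuming Python 3, and the names Inbox, send, AUTHKEY and _STOP are invented here. Each process would own one Inbox, whose accept loop runs in a background thread, and other processes would deliver messages by connecting to its address.

```python
import queue
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"demo-key"  # hypothetical shared secret for this sketch
_STOP = "__stop__"     # hypothetical sentinel that shuts an inbox down


class Inbox:
    """Receives messages on a Listener served by a background thread."""

    def __init__(self):
        # Port 0 lets the OS pick a free port; the real address is read back.
        self._listener = Listener(("localhost", 0), authkey=AUTHKEY)
        self.address = self._listener.address
        self.messages = queue.Queue()  # thread-safe, local to this process
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # One connection per message keeps the sketch simple.
        while True:
            with self._listener.accept() as conn:
                msg = conn.recv()
            if msg == _STOP:
                break
            self.messages.put(msg)

    def close(self):
        send(self.address, _STOP)  # wake the accept loop so it can exit
        self._thread.join()
        self._listener.close()


def send(address, message):
    """Deliver one picklable message to the inbox listening at address."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(message)


if __name__ == "__main__":
    inbox = Inbox()
    send(inbox.address, "hello")
    print(inbox.messages.get(timeout=5))
    inbox.close()
```

An address here is a plain (host, port) tuple, so, unlike a Queue, it could be stored in the Manager.dict() from the question and looked up by any process that joins later. Whether that fits the island model well is something to verify in practice.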
This code is based on the accepted answer. It is written in Python 3 because OS X Snow Leopard segfaults on some uses of multiprocessing.
The standard library socketserver is provided to help avoid programming select() manually. In this version, we start a socketserver in a separate thread so that each Process can do (well, pretend to do) computation in its main loop.
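The socketserver-in-a-thread arrangement described here might look something like the sketch below. It is not the commenter's code. It assumes Python 3 and a newline-delimited text protocol chosen only for illustration, and the names InboxHandler, InboxServer, start_inbox and send_line are hypothetical.

```python
import queue
import socket
import socketserver
import threading


class InboxHandler(socketserver.StreamRequestHandler):
    """Puts every received line into the owning server's inbox."""

    def handle(self):
        for line in self.rfile:
            self.server.inbox.put(line.decode("utf-8").rstrip("\n"))


class InboxServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # handler threads do not block interpreter exit

    def __init__(self, address=("localhost", 0)):
        # Port 0 lets the OS choose; see server_address for the result.
        super().__init__(address, InboxHandler)
        self.inbox = queue.Queue()


def start_inbox():
    """Start the server in a separate thread and return it."""
    server = InboxServer()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_line(address, text):
    """Send one line of text to the inbox server at address."""
    with socket.create_connection(address) as sock:
        sock.sendall((text + "\n").encode("utf-8"))


if __name__ == "__main__":
    server = start_inbox()
    send_line(server.server_address, "hello from another island")
    # The main loop stays free for computation and polls the inbox.
    print(server.inbox.get(timeout=5))
    server.shutdown()
    server.server_close()
```

Because serve_forever() runs in its own thread, the process's main loop can poll server.inbox with get_nowait() between computation steps, in the same way the question's code polls its Queue.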