Using gevent and multiprocessing together to communicate with a subprocess
Question:
Can I use the multiprocessing module together with gevent on Windows in an efficient way?
Scenario:
I have a gevent based Python application doing asynchronous I/O on Windows. The application is mostly I/O bound, but there are spikes of higher CPU load as well. This application would need to control a console application via its stdin and stdout. I cannot modify this console application and the user will be able to use his own custom one, only the text (line) based communication protocol is fixed.
I have a working implementation using subprocess and threads, but I would rather move the whole subprocess based communication code together with those threads into a separate process to turn the main application back to single-threaded. I plan to use the multiprocessing module for this.
Prior reading:
I have searched the Web a lot and read some source code, so I know that the multiprocessing module uses a Pipe implementation based on named pipes on Windows. A pair of multiprocessing.Queue objects would be used to communicate with the second Python process. These queues are built on that Pipe implementation, i.e. the IPC would be done via named pipes.
The key question is whether calling the incoming Queue's get method would block gevent's main loop. That method accepts a timeout, so I could call it in a loop with a small timeout, but that's not a good solution: it would still block gevent for short periods, hurting its low I/O latency.
I'm also open to suggestions on how to circumvent the whole problem of using pipes on Windows, which is known to be hard and sometimes fragile. I'm not sure whether shared memory based IPC is possible on Windows or not. Maybe I could wrap the console application in a way which would allow communicating with the child process using network sockets, which is known to work well with gevent.
Please don't question my primary use case, if possible. Thanks.
Comments (2)
The Queue's get method really is blocking. Using it with a timeout could potentially solve your problem, but it definitely won't be the cleanest solution and, most importantly, it will introduce extra latency for no good reason. Even if it weren't blocking, it still wouldn't be a good solution: non-blocking by itself is not enough. A good asynchronous call/API should integrate smoothly into the I/O framework in use, be that gevent for Python, libevent for C or Boost ASIO for C++.
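To make the latency cost concrete, here is a minimal sketch using only the standard library. The helper names `timed_get` and `measure_empty_poll` are made up for this illustration. It measures how long a `get(timeout=...)` call on an empty `multiprocessing.Queue` holds the calling thread. In a gevent application that thread is the one running the event loop, so nothing else would be serviced during that window.

```python
import multiprocessing
import queue
import time


def timed_get(q, timeout):
    """Call q.get(timeout=...) and return (got_item, seconds_blocked)."""
    start = time.monotonic()
    try:
        q.get(timeout=timeout)
        got_item = True
    except queue.Empty:
        # multiprocessing.Queue raises queue.Empty when the timeout expires.
        got_item = False
    return got_item, time.monotonic() - start


def measure_empty_poll(timeout=0.05):
    """Poll an empty queue once and report how long the caller was stuck."""
    q = multiprocessing.Queue()
    try:
        return timed_get(q, timeout)
    finally:
        q.close()
        q.join_thread()
```

Polling in a loop therefore trades latency against CPU use: a larger timeout stalls everything else for longer, while a smaller one burns more cycles on empty polls.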
The easiest solution would be to use simple I/O: spawn your console application and attach to its console input and output descriptors. There are two major factors to consider:
However, the downside is that you will have to start this application yourself, there will be no support for concurrent communication with it, and there will be no support for communication over a network. There is even a good example for starters.
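As a rough sketch of that simple approach, the snippet below spawns a child process and exchanges one line at a time over its stdin and stdout. The child is a hypothetical stand-in for the real console application: a tiny Python script that upper-cases each line. The helper names `start_console_app`, `request` and `stop_console_app` are invented for this example. It uses the blocking standard-library `subprocess` module. A gevent application would need a cooperative equivalent such as `gevent.subprocess`, which is not shown here.

```python
import subprocess
import sys

# Hypothetical stand-in for the console application: it answers every
# input line with the same line in upper case.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def start_console_app():
    """Spawn the child with pipes attached to its stdin and stdout."""
    return subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,   # exchange str lines instead of bytes
        bufsize=1,   # line-buffered on the parent side
    )


def request(proc, line):
    """Send one line and block until the child answers with one line."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def stop_console_app(proc):
    """Close stdin so the child sees EOF, then wait for it to exit."""
    proc.stdin.close()
    returncode = proc.wait(timeout=5)
    proc.stdout.close()
    return returncode
```

Note that `request` assumes a strict one-line-in, one-line-out protocol. If the real application can emit several lines per request or write unprompted, the reader side needs its own loop.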
To keep it simple but more flexible, you can use TCP/IP sockets, even if both client and server are running on the same machine. A good operating system will use IPC as the underlying implementation in that case, so it will be fast. And if you are worried about the performance of this setup, you probably should not use Python at all and should look at other technologies.
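One way to picture the socket variant is a small bridge that owns the console application and relays bytes between its pipes and a localhost TCP connection. The following is only a sketch under several assumptions: the child is a hypothetical upper-casing script, the names `serve_one_client`, `start_bridge` and `ask` are invented here, only a single client is served, and error handling is omitted. In a real setup the bridge would run as its own process and the main application would connect with a gevent-aware socket. Here it runs in a thread so the example stays self-contained.

```python
import socket
import subprocess
import sys
import threading

# Hypothetical stand-in for the console application: upper-cases each line.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)
CHILD_ARGV = [sys.executable, "-u", "-c", CHILD_CODE]


def serve_one_client(server_sock, argv):
    """Accept one TCP client and relay bytes between it and the child."""
    conn, _addr = server_sock.accept()
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def socket_to_child():
        while True:
            data = conn.recv(4096)
            if not data:          # client closed the connection
                break
            proc.stdin.write(data)
            proc.stdin.flush()
        proc.stdin.close()        # EOF on stdin lets the child exit

    feeder = threading.Thread(target=socket_to_child, daemon=True)
    feeder.start()
    while True:
        data = proc.stdout.read1(4096)
        if not data:              # child closed stdout (it has exited)
            break
        conn.sendall(data)
    feeder.join()
    proc.wait()
    proc.stdout.close()
    conn.close()


def start_bridge(argv=CHILD_ARGV):
    """Listen on a free localhost port and serve one client in a thread."""
    server_sock = socket.create_server(("127.0.0.1", 0))
    thread = threading.Thread(target=serve_one_client, args=(server_sock, argv))
    thread.start()
    return server_sock, thread


def ask(port, lines):
    """Client side: send each line and collect one reply line per request."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        with client.makefile("rw", newline="\n") as stream:
            replies = []
            for line in lines:
                stream.write(line + "\n")
                stream.flush()
                replies.append(stream.readline().rstrip("\r\n"))
            return replies
```

The threads now live in the bridge rather than in the main application, which only ever sees a socket.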
An even fancier solution: use ZeroC ICE. It is a very modern technology allowing almost seamless inter-process communication. It is a CORBA killer and very easy to use. It is heavily used by many, and proven to be the fastest in its class and rock stable. The beauty of this solution is that you can seamlessly integrate programs written in many different languages, like Python, Java, C++ etc. But it will take some of your time to get familiar with the concept. If you decide to go this way, just spend a day reading through the documentation.
Hope it helps. Good luck!
Your question is already quite old. Nevertheless, I would like to recommend http://gehrcke.de/gipc which, I believe, would tackle the outlined challenge in a very straightforward fashion. Basically, it allows you to integrate multiprocessing-based child processes anywhere in your application (also on Windows). Interaction with Process objects (such as calling join()) is gevent-cooperative. Via its pipe management, it allows for cooperatively blocking inter-process communication. However, on Windows, IPC is currently much less efficient than on POSIX-compliant systems (since non-blocking I/O is imitated through a thread pool). Depending on the IPC messaging volume of your application, this might or might not be of significance.