How to share data between Python processes without writing to disk
Hello,
I would like to share small amounts of data (< 1K) between Python processes. The data is physical pc/104 IO data which changes rapidly and often (24x7x365). There will be a single "server" writing the data and multiple clients reading portions of it.
The system this will run on uses flash memory (CF card) rather than a hard drive, so I'm worried about wearing out the flash memory with a file-based scheme. I'd also like to use less power (processor time) as we are 100% solar powered.
- Is this a valid worry? We could possibly change the CF card to a SSD.
- Does changing a value using mmap physically write the data to disk or is this a virtual file?
- We will be running on Debian so perhaps the POSIX IPC for python module is the best solution. Has anyone used it?
- Has anyone tried the Python Object Sharing (POSH) module? It looks promising at first glance but it is in "Alpha" and doesn't seem to be actively being developed.
Thank You
UPDATE:
We slowed down the maximum data update rate to about 10 Hz, but more typically 1 Hz. Clients will only be notified when a value changes rather than at a constant update rate.
We have gone to a multiple servers/multiple clients model where each server specializes in a certain type of instrument or function.
Since it turned out that most of the programming was going to be done by Java programmers, we ended up using JSON-RPC over TCP. The servers will be written in Java, but I still hope to write the main client in Python and am investigating JSON-RPC implementations.
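For reference, a minimal sketch of the client side of that approach: framing JSON-RPC 2.0 messages for a TCP stream. The method name `read_channel` and the newline-delimited framing are assumptions for illustration, not part of any particular JSON-RPC library.

```python
import itertools
import json

_ids = itertools.count(1)

def jsonrpc_request(method: str, params) -> bytes:
    """Build a JSON-RPC 2.0 request, newline-delimited for a TCP stream."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    return (json.dumps(msg) + "\n").encode()

def parse_response(raw: bytes):
    """Extract the result from a JSON-RPC 2.0 response, raising on error."""
    resp = json.loads(raw)
    if "error" in resp:
        raise RuntimeError(resp["error"])
    return resp["result"]
```

The bytes returned by `jsonrpc_request` would be written to a TCP socket connected to one of the Java servers, and each newline-terminated reply fed through `parse_response`.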
An alternative to writing the data to file in the server process might be to directly write to the client processes:
Use UNIX domain sockets (or TCP/IP sockets if the clients run on different machines) to connect each client to the server, and have the server write into those sockets. Depending on your particular processing model, choosing a client/socket may be done by the server (e.g. round-robin) or by the clients signalling that they're ready for more.
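A minimal sketch of this pattern in Python, with a thread standing in for the server process and newline-delimited JSON as an assumed wire format (the socket path and field names are made up for illustration):

```python
import json
import os
import socket
import tempfile
import threading

# Hypothetical rendezvous path for the UNIX domain socket.
SOCK_PATH = os.path.join(tempfile.mkdtemp(), "io_data.sock")

def server(ready: threading.Event, updates: list) -> None:
    """Single writer: push newline-delimited JSON updates to a connected client."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(SOCK_PATH)
        srv.listen(1)
        ready.set()                      # let the client know it can connect
        conn, _ = srv.accept()
        with conn:
            for update in updates:
                conn.sendall((json.dumps(update) + "\n").encode())

def client() -> list:
    """Reader: collect updates until the server closes the connection."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as c:
        c.connect(SOCK_PATH)
        with c.makefile("r") as lines:
            return [json.loads(line) for line in lines]

ready = threading.Event()
updates = [{"ch": 0, "value": 3.3}, {"ch": 1, "value": 5.0}]
t = threading.Thread(target=server, args=(ready, updates))
t.start()
ready.wait()
received = client()
t.join()
```

In the real system the server and each client would be separate processes; nothing here ever touches the CF card.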
Create a ramfs partition and write to that. (You could use tmpfs, but unlike tmpfs, ramfs is not swapped to disk). However, as ramfs doesn't have a size limit, you must take care that you don't run out of memory; since you're only writing a tiny bit of data there, it shouldn't be a problem.
This way, your data won't ever be written to a disk (note: you will lose them if power fails).
See also the ramfs docs.
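Assuming a ramfs (or tmpfs) mount point such as `/mnt/ramfs` already exists, the writer could publish snapshots like this. The atomic-rename step is my own addition so that readers never observe a half-written file:

```python
import json
import os

# Assumes something like this was done at boot (not run here):
#   mount -t ramfs ramfs /mnt/ramfs

def publish(data: dict, directory: str, name: str = "io_state.json") -> str:
    """Write a small snapshot into a RAM-backed directory, atomically."""
    path = os.path.join(directory, name)
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(data, f)
    os.replace(tmp, path)  # rename is atomic: readers see old or new, never partial
    return path

def read_state(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

Because the directory lives in RAM, each `publish` call costs memory writes only, and wears nothing out.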
According to the Wikipedia article about the mmap system call, the contents of memory-mapped files are written back to disk when updated.
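That applies to file-backed mappings; an anonymous mapping has no backing file at all, so nothing can be flushed to disk. A quick sketch (POSIX-only, since it relies on fork; the sample value is arbitrary):

```python
import mmap
import os
import struct

# -1 means an anonymous mapping: backed by RAM only, no file, nothing hits disk.
# Anonymous mmaps are shared (MAP_SHARED) by default, so forked children see writes.
buf = mmap.mmap(-1, 1024)

struct.pack_into("d", buf, 0, 3.14)   # parent writes one double at offset 0
pid = os.fork()
if pid == 0:
    # Child reads the same physical pages the parent wrote.
    value = struct.unpack_from("d", buf, 0)[0]
    os._exit(0 if value == 3.14 else 1)
_, status = os.waitpid(pid, 0)
parent_value = struct.unpack_from("d", buf, 0)[0]
```

The catch is that unrelated processes can't easily attach to an anonymous mapping; it works naturally only between a parent and its forked children, which matches the single-server model here.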
Have you looked at the multiprocessing module (in the standard library) - especially the section "Sharing state between processes"?
Ramfs as mentioned by Piskvor also seems like a good solution - especially when not all processes are written in Python.
When running on flash systems, make sure your filesystem is designed properly to maximize the life of the flash memory (wear levelling). JFFS and, I believe, others are now capable of doing this effectively. If you use such a system, you shouldn't be overly concerned about using the flash, but certainly if you're writing a constant stream of data you'd want to avoid doing that on the flash.
Using a RAM filesystem is a good idea. Better yet is to avoid filesystems entirely if the system design will let you. To that end you mention POSH. I've never tried it, but we've found Pyro ("PYthon Remote Objects") to be an elegant and effective solution in some similar cases.
And of course there's the standard library multiprocessing module, which bears some similarities in terms of how it communicates between processes. I'd start there for any new development in this area, and go elsewhere only if it failed to pan out.