PUB/SUB with a short-lived publisher and long-lived subscribers
Context: OS: Linux (Ubuntu), language: C (actually Lua, but this should not matter).
I would prefer a ZeroMQ-based solution, but will accept anything sane enough.
Note: For technical reasons I can not use POSIX signals here.
I have several identical long-living processes on a single machine ("workers").
From time to time I need to deliver a control message to each of the processes via a command-line tool. Example:
$ command-and-control worker-type run-collect-garbage
Each of the workers on this machine should receive a run-collect-garbage message. Note: it would be perfect if the solution somehow worked for all workers on all machines in the cluster, but I can write that part myself.
This is easily done if I store some information about the running workers. For example, keep their PIDs in a known location and open a control Unix domain socket on a known path with the PID somewhere in it. Or open a TCP socket and store the host and port somewhere.
But this would require careful management of the stored information — e.g. what if worker process suddenly dies? (Nothing unmanageable, but, still, extra fuss.) Also, the information needs to be stored somewhere, thus adding an extra bit of complexity.
Is there a good way to do this in PUB/SUB style? That is, the workers are subscribers, the command-and-control tool is a publisher, and all they know is a single "channel URL", so to speak, where the messages can be picked up.
Additional requirements:
- Messages to the control channel must wake up workers from their poll (select, whatever) loop; see the sketch after this list for what I mean.
- Message delivery must be guaranteed, and it must reach each and every worker that is listening.
- Workers should have a way to monitor for messages without blocking, ideally via the poll/select/whatever loop mentioned above.
- Ideally, the worker process should be a "server" in a sense: it should not have to bother with keeping connections to the "channel server" (if any) persistent and so on, or this should be done transparently by the framework.
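For concreteness, here is roughly the worker-side shape I am after; a minimal sketch, assuming ZeroMQ with a SUB socket and a made-up well-known endpoint tcp://127.0.0.1:5556 (my real code is Lua, but C shows the idea just as well):

#include <stdio.h>
#include <string.h>
#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();
    void *control = zmq_socket(ctx, ZMQ_SUB);
    zmq_connect(control, "tcp://127.0.0.1:5556");      /* the "channel URL" */
    zmq_setsockopt(control, ZMQ_SUBSCRIBE, "", 0);      /* receive everything */

    while (1) {
        zmq_pollitem_t items[] = {
            { control, 0, ZMQ_POLLIN, 0 },
            /* ...the worker's other sockets/fds would go here... */
        };
        zmq_poll(items, 1, 1000);                       /* wakes up when a control message arrives */

        if (items[0].revents & ZMQ_POLLIN) {
            char cmd[256];
            int n = zmq_recv(control, cmd, sizeof(cmd) - 1, 0);
            if (n >= 0) {
                cmd[n] = '\0';
                if (strcmp(cmd, "run-collect-garbage") == 0) {
                    /* ...do the actual garbage collection... */
                }
            }
        }
        /* ...normal worker duties... */
    }
}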
4 Answers
Usually such a pattern requires a proxy for the publisher, i.e. you send to the proxy, which immediately accepts delivery and then reliably forwards to the end subscriber workers. The ZeroMQ guide covers a few different methods of implementing this.
http://zguide.zeromq.org/page:all
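A minimal sketch of such a proxy, assuming ZeroMQ 3.2+ with its built-in zmq_proxy() and placeholder tcp:// endpoints (the command tool connects a PUB socket to the frontend, workers connect SUB sockets to the backend):

#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();

    void *frontend = zmq_socket(ctx, ZMQ_XSUB);   /* publishers (the command tool) connect here */
    zmq_bind(frontend, "tcp://*:5555");

    void *backend = zmq_socket(ctx, ZMQ_XPUB);    /* subscribers (the workers) connect here */
    zmq_bind(backend, "tcp://*:5556");

    zmq_proxy(frontend, backend, NULL);           /* blocks, shuttling messages and subscriptions */
    return 0;
}

Note that a short-lived publisher that connects, sends one message and exits can still lose that message to the slow-joiner problem, so for the "guaranteed delivery" requirement you would layer one of the reliable pub/sub patterns from the guide on top.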
Given your requirements, Steve's suggestion does seem the simplest: run a daemon which listens on two known sockets; the workers connect to one, the command tool pushes to the other, and the daemon redistributes to the connected workers.
You could do something more complicated that would probably work, by effectively nominating one of the workers. For example, on startup the workers attempt to bind() a PUB ipc:// socket somewhere accessible, like /tmp. The one that wins bind()s a second IPC as a PULL socket and acts as a forwarder device on top of its normal duties; the others connect() to the original IPC. The command-line tool connect()s to the second IPC and pushes its message. The risk there is that the winner dies, leaving a locked file. You could detect this in the command-line tool, rebind, then sleep (to allow the connections to be established). Still, that's all a little complex; I think I'd go with a proxy!
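A sketch of that daemon plus the matching command tool, assuming ZeroMQ and placeholder ipc:// paths under /tmp (commands arrive on a PULL socket and are re-published to the workers over PUB):

/* control daemon */
#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();

    void *in  = zmq_socket(ctx, ZMQ_PULL);   /* the command-and-control tool pushes here */
    zmq_bind(in, "ipc:///tmp/control-in");

    void *out = zmq_socket(ctx, ZMQ_PUB);    /* the workers subscribe here */
    zmq_bind(out, "ipc:///tmp/control-out");

    while (1) {
        zmq_msg_t msg;
        zmq_msg_init(&msg);
        if (zmq_msg_recv(&msg, in, 0) >= 0)
            zmq_msg_send(&msg, out, 0);      /* fan out to every connected worker */
        else
            zmq_msg_close(&msg);
    }
}

/* command tool: push one command and exit */
#include <string.h>
#include <zmq.h>

int main(int argc, char *argv[])
{
    const char *cmd = argc > 1 ? argv[1] : "run-collect-garbage";
    void *ctx = zmq_ctx_new();
    void *push = zmq_socket(ctx, ZMQ_PUSH);
    zmq_connect(push, "ipc:///tmp/control-in");
    zmq_send(push, cmd, strlen(cmd), 0);
    zmq_close(push);
    zmq_ctx_destroy(ctx);                    /* with the default linger this waits until the message is flushed */
    return 0;
}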
I think what you're describing would fit well with a gearmand/supervisord implementation.
Gearman is a great task queue manager and supervisord would allow you to make sure that the process(es) are all running. It's TCP based too so you could have clients/workers on different machines.
http://gearman.org/
http://supervisord.org/
I recently set something up with multiple gearmand nodes, linked to multiple workers, so that there's no single point of failure.
edit: Sorry - my bad, I just re-read and saw that this might not be ideal.
Redis has some nice and simple looking pub/sub functionality that I've not used yet but sounds promising.
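If you go that route, the worker side might look roughly like this; a sketch using the hiredis C client against a local Redis, with a made-up channel name (none of which is implied by the above):

#include <stdio.h>
#include <string.h>
#include <hiredis/hiredis.h>

int main(void)
{
    redisContext *c = redisConnect("127.0.0.1", 6379);
    if (c == NULL || c->err) return 1;

    redisReply *reply = redisCommand(c, "SUBSCRIBE worker-control");
    freeReplyObject(reply);

    /* Simple blocking loop; hiredis also has an async API that can plug
       into an existing event loop if blocking here is not acceptable. */
    while (redisGetReply(c, (void **)&reply) == REDIS_OK) {
        if (reply->type == REDIS_REPLY_ARRAY && reply->elements == 3 &&
            strcmp(reply->element[0]->str, "message") == 0)
            printf("got command: %s\n", reply->element[2]->str);
        freeReplyObject(reply);
    }
    redisFree(c);
    return 0;
}

The command tool is then just redis-cli PUBLISH worker-control run-collect-garbage (or the equivalent call in any client).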
Use a multicast PUB/SUB. You'll have to make sure the pgm option is compiled into your ZeroMQ distribution (man 7 zmq_pgm).
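With that in place there is no broker at all: the command tool and the workers just share a multicast "channel URL". A sketch, where the interface name, group address and port are placeholders (epgm:// is PGM encapsulated in UDP; raw pgm:// needs extra privileges):

#include <string.h>
#include <zmq.h>

int main(void)
{
    void *ctx = zmq_ctx_new();

    /* command tool side */
    void *pub = zmq_socket(ctx, ZMQ_PUB);
    zmq_connect(pub, "epgm://eth0;239.192.1.1:5555");
    zmq_send(pub, "run-collect-garbage", strlen("run-collect-garbage"), 0);

    /* worker side (normally a separate process) */
    void *sub = zmq_socket(ctx, ZMQ_SUB);
    zmq_connect(sub, "epgm://eth0;239.192.1.1:5555");
    zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "", 0);
    /* ...then feed sub into the usual zmq_poll() loop... */
    return 0;
}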