C# UDP multicast socket - how to handle link failures
I'm working on a UDP multicast library and have a question about how to properly handle link failures, disconnected/reconnected NIC cables, etc.
In my test I have the following setup:
- 2 servers sA and sB
- sA is sending UDP multicast data and sB is receiving multicast data
- servers are connected through a Layer 2 Cisco gigabit switch
For example, when I join the multicast group on sB, I start receiving sA's multicast packets on that socket.
Now, when I disable or unplug the NIC that the multicast receiver on sB is bound to, I don't get any socket-level errors (e.g. in Socket.ReceiveAsync). I guess that is expected, since UDP is connectionless, but I was hoping for some kind of notification or exception when the IP the receiver is bound to becomes unavailable.
Anyway, when I re-enable that NIC, I don't receive any more data, although the sender is still sending to the same multicast group.
I was hoping the kernel would handle rejoining the multicast group after a hardware link failure, but it looks like it doesn't. And since I'm not getting any socket-level errors either, I don't really know how to detect a link failure on a multicast receiver.
Are there socket options that need to be set so that the kernel rejoins the multicast group?
The only option I've come up with so far is to listen for System.Net.NetworkInformation.NetworkChange.NetworkAddressChanged events and attempt to rebind when I get a notification that the local IP I have to bind to is available again.
How are other multicast applications handling that scenario?
Thanks,
Tom
Comments (2)
I recommend subscribing to the following event: System.Net.NetworkInformation.NetworkChange.NetworkAvailabilityChanged
When the network becomes unavailable, have your event handler reset the receiver gracefully. Conversely, when the network becomes available again, rebind your receiver.
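A minimal sketch of such a reset-and-rebind handler, not the answerer's actual code. The class name `RejoiningMulticastReceiver`, its members, and the choice to bind to `IPAddress.Any` are assumptions made for illustration; the calls it relies on (`NetworkChange.NetworkAvailabilityChanged`, `SocketOptionName.AddMembership`, `MulticastOption`) are standard .NET APIs. One caveat: `NetworkAvailabilityChanged` reports the availability of the network as a whole, so on a host with several active NICs it may not fire when only one cable is pulled, and subscribing to `NetworkAddressChanged` as well may be necessary there.

```csharp
#nullable enable
using System;
using System.Net;
using System.Net.NetworkInformation;
using System.Net.Sockets;

// Hypothetical helper (name and shape are assumptions): owns one receiving
// socket and recreates it whenever the network becomes available again.
public sealed class RejoiningMulticastReceiver : IDisposable
{
    private readonly IPAddress _group;    // multicast group address
    private readonly IPAddress _localIp;  // address of the NIC to join on
    private readonly int _port;
    private readonly object _gate = new object();
    private Socket? _socket;

    public RejoiningMulticastReceiver(IPAddress group, IPAddress localIp, int port)
    {
        _group = group;
        _localIp = localIp;
        _port = port;
        NetworkChange.NetworkAvailabilityChanged += OnAvailabilityChanged;
    }

    // Current socket, or null while the receiver is reset.
    public Socket? Current { get { lock (_gate) { return _socket; } } }

    private void OnAvailabilityChanged(object? sender, NetworkAvailabilityEventArgs e)
    {
        if (e.IsAvailable) TryRejoin();
        else Reset();
    }

    // Drops any old socket, then binds a fresh one and joins the group on it.
    public bool TryRejoin()
    {
        lock (_gate)
        {
            CloseCurrent();
            var s = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
            try
            {
                s.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress, true);
                s.Bind(new IPEndPoint(IPAddress.Any, _port));
                // A join on a fresh socket typically makes the OS send a new
                // IGMP membership report for the group.
                s.SetSocketOption(SocketOptionLevel.IP, SocketOptionName.AddMembership,
                                  new MulticastOption(_group, _localIp));
                _socket = s;
                return true;
            }
            catch (SocketException)
            {
                s.Dispose(); // e.g. the NIC address is not back yet; caller may retry
                return false;
            }
        }
    }

    public void Reset() { lock (_gate) { CloseCurrent(); } }

    private void CloseCurrent()
    {
        _socket?.Dispose();
        _socket = null;
    }

    public void Dispose()
    {
        NetworkChange.NetworkAvailabilityChanged -= OnAvailabilityChanged;
        Reset();
    }
}
```

A receive loop built on this would probably read `Current` on each iteration and treat an `ObjectDisposedException` or `SocketException` from a pending receive as a signal to wait for the next successful `TryRejoin()`.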
I can't go into detail because how my company's protocols work is a company secret, but program in a periodic heartbeat between your server and your clients. Your software can then track when the last heartbeat arrived and deduce whether you have suffered some sort of network or hardware failure.
There are plenty of options you can play around with to detect that a failure occurred, including checking NetworkAddressChanged, but it is safer to implement the heartbeat because it's a generic solution that is easy to implement and should cover nearly all cases.
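The receiver-side timing described above could look roughly like the sketch below. It is an illustration only: the answer does not disclose its protocol, so the `HeartbeatMonitor` class, its member names, and the injectable clock are all assumptions. The idea is simply to record when the last datagram arrived and to flag the link as suspect once a chosen timeout has passed without one.

```csharp
#nullable enable
using System;
using System.Diagnostics;

// Hypothetical helper (not from the answer): tracks the time since the last
// heartbeat and reports when it exceeds a timeout.
public sealed class HeartbeatMonitor
{
    // Monotonic default clock, so wall-clock adjustments do not cause false alarms.
    private static readonly Stopwatch Clock = Stopwatch.StartNew();

    private readonly TimeSpan _timeout;
    private readonly Func<TimeSpan> _now;
    private readonly object _gate = new object();
    private TimeSpan _lastSeen;

    // 'now' is an optional clock override, mainly useful for tests.
    public HeartbeatMonitor(TimeSpan timeout, Func<TimeSpan>? now = null)
    {
        if (timeout <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(timeout));
        _timeout = timeout;
        _now = now ?? (() => Clock.Elapsed);
        _lastSeen = _now(); // start counting from construction
    }

    // Call this for every heartbeat (or any datagram) received from the sender.
    public void RecordHeartbeat()
    {
        lock (_gate) { _lastSeen = _now(); }
    }

    // True once more than the timeout has elapsed since the last heartbeat.
    public bool IsLinkSuspect
    {
        get { lock (_gate) { return _now() - _lastSeen > _timeout; } }
    }
}
```

In use, the sender might multicast a small heartbeat datagram at a fixed interval, the receiver would call `RecordHeartbeat()` from its receive loop, and a timer would poll `IsLinkSuspect` and trigger a reset and rejoin of the multicast socket when it turns true. The interval and timeout values are a tuning choice; a timeout of a few heartbeat intervals is a common starting point so that a single lost datagram does not count as a failure.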