Lost UDP packets (JBoss + DatagramSocket)
I develop part of a JBoss+EJB based enterprise application. My module needs to process a huge amount of incoming UDP packets. I've done some load testing, and it looks like everything is fine when packets are sent at an 11 ms interval, but at a 10 ms interval some packets are lost. That seems rather strange to me, but I have run the 10 ms / 11 ms comparison several times and the result is always the same (10 ms - some "lost" packets, 11 ms - everything is fine).
If something were wrong with synchronization, I'd expect it to show up in the 11 ms tests as well (at least one packet lost, or at least one wrong counter value).
So if it is not synchronization, then maybe the DatagramSocket through which I receive packets doesn't work as expected.
I found that the receive buffer size (SO_RCVBUF) has a default value of 57344 (probably dependent on the underlying OS network buffers). I suspect that when this buffer fills up, new incoming UDP datagrams are dropped. I tried setting the value higher, but I noticed that if I exaggerate, the buffer falls back to its default size. If it depends on the underlying layer, how can I find out, from the JBoss level, the maximum buffer size for a given OS/network card?
Is it possible that the loss is caused by the receive buffer size, or is 57344 big enough to handle most cases? Do you have any experience with such issues?
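For reference, this is roughly how I set and read back the buffer size from Java (a minimal standalone sketch, not the actual component; the port number is just a placeholder):

```java
import java.net.DatagramSocket;
import java.net.SocketException;

public class RcvBufCheck {
    public static void main(String[] args) throws SocketException {
        // Placeholder port; the real listener uses whatever port the exporter targets.
        DatagramSocket socket = new DatagramSocket(9996);

        // Ask for a larger receive buffer; the OS may silently cap the request.
        socket.setReceiveBufferSize(4 * 1024 * 1024);

        // getReceiveBufferSize() reports what the OS actually granted, which is
        // the only way to see the effective SO_RCVBUF from the Java side.
        System.out.println("Effective SO_RCVBUF: " + socket.getReceiveBufferSize());

        socket.close();
    }
}
```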
There is no timeout set on my DatagramSocket. My UDP datagrams contain about 70 bytes of data (not counting the datagram header).
[Edited]
I have to use UDP because I receive Cisco NetFlow data - a protocol used by network devices to send traffic statistics. I also have no influence over the format of the bytes being sent (e.g. I cannot add packet counters and so on). It is not expected that every packet will be processed (some datagrams may be lost), but I would expect to process most of them. During the 10 ms interval tests, about 30% of the packets were lost.
It is not very likely that slow processing causes this issue. Currently a singleton component holds a reference to the DatagramSocket and calls its receive method in a loop. When a packet is received, it is put on a queue and processed by a stateless component picked from a pool. The "facade" singleton is only responsible for receiving packets and passing them on for processing (it does not wait for a processing-complete event).
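A simplified, standalone sketch of that receiving facade (class and queue names are illustrative, not the actual EJB code):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The receive loop does nothing but read datagrams and enqueue copies of the
// payload; the pooled stateless components drain the queue on other threads.
public class ReceiverFacadeSketch {
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

    public void receiveLoop(DatagramSocket socket) throws Exception {
        byte[] buffer = new byte[2048]; // comfortably larger than a ~70-byte datagram
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        while (!Thread.currentThread().isInterrupted()) {
            packet.setLength(buffer.length); // reset length before reusing the packet
            socket.receive(packet);          // blocks until a datagram arrives
            // Copy the payload out immediately so the buffer can be reused.
            queue.offer(Arrays.copyOf(packet.getData(), packet.getLength()));
        }
    }

    public BlockingQueue<byte[]> getQueue() {
        return queue;
    }
}
```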
Thanks in advance,
Piotr
UDP does not guarantee delivery, so you can tweak parameters, but you can't guarantee that the message will get delivered, especially in the case of very large data transfers.
If you need to guarantee delivery, you should use TCP instead.
If you need (or want) to use UDP, you can encode each packet with a number, and also send the number of packets expected. For example, if you sent 10 large packets, you could include the information: packet 1/10, packet 2/10, etc. This way you can at least tell if you have not received all of the packets. If you have not received them, you could send a request to resend those missing packets.
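A minimal sketch of such a numbering scheme (purely hypothetical framing, since it only applies when you control the sender's format):

```java
import java.nio.ByteBuffer;

// Hypothetical framing: prefix each payload with "index / total" so the
// receiver can detect gaps and ask for a resend.
public final class SequencedPacket {

    public static byte[] encode(int index, int total, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(8 + payload.length);
        buf.putInt(index);  // 1-based packet index within the batch
        buf.putInt(total);  // total packets expected in the batch
        buf.put(payload);
        return buf.array();
    }

    public static int[] decodeHeader(byte[] datagram) {
        ByteBuffer buf = ByteBuffer.wrap(datagram);
        return new int[] { buf.getInt(), buf.getInt() }; // { index, total }
    }
}
```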
UDP is inherently unreliable.
Datagrams can be thrown away at any point between sender and receiver, even within the receiver at a level below your code. Setting the recv buffer to a larger size is likely to help the networking code within your machine buffer more datagrams but you should expect that some datagrams will be lost anyway.
If your recv logic takes too long (i.e. longer than it takes for a new datagram to arrive) then you'll always be behind and you'll always miss datagrams eventually. All you can do is make sure that your recv code runs as fast as possible, perhaps move the inbound datagram to a queue and process it 'later' or on another thread, but that just moves the problem to one where you have a queue that keeps growing.
[Re your edit...] What's processing your queue, and how does the locking work between the producer and the consumers? Change your code so that the recv logic simply increments a count, discards the data and loops back around, and see if you're losing fewer datagrams. Either way, UDP is unreliable: you WILL have datagrams that are discarded, and you should just expect that and deal with it. Worrying about it means you're focusing on the wrong problem; make use of the data you DO get, assume that you won't get much of it, and then your program will work even if the network gets congested and MOST of your datagrams get discarded.
In summary, that's just how it is with UDP.
It appears that in your tests no more than two packets would be in the buffer at a time, so as long as each packet is smaller than about 28 KB the default buffer should be fine.
As you know, UDP is lossy, but you should be able to send more than one packet per 10 ms. I suggest you write a simple receiver which just listens for packets, to determine whether it's your application or something at the network/OS level. (I suspect the latter.)
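Something along these lines would do as a throwaway test (port and buffer size are placeholders):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Bare-bones diagnostic receiver: it only counts datagrams, so any loss it
// shows is happening at the network/OS level rather than in the application.
public class CountingReceiver {
    public static void main(String[] args) throws Exception {
        try (DatagramSocket socket = new DatagramSocket(9996)) { // placeholder port
            byte[] buffer = new byte[2048];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            long count = 0;
            while (true) {
                packet.setLength(buffer.length);
                socket.receive(packet);
                count++;
                if (count % 1000 == 0) {
                    System.out.println("Received " + count + " datagrams");
                }
            }
        }
    }
}
```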
I don't know Java, but... does the API allow you to invoke an asynchronous listen/receive for a datagram?
If that's true, then I suggest you run several concurrent instances of the API call, so that there are several concurrent application-level buffers into which multiple packets can be received.
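In Java the standard DatagramSocket.receive() call is blocking rather than asynchronous, but the closest equivalent to this suggestion is to have several threads blocked in receive() on the same socket, so several application-level buffers are waiting at once (a rough sketch; port and thread count are arbitrary):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Several threads blocked in receive() on one shared socket; each incoming
// datagram is delivered to exactly one of the waiting threads.
public class MultiThreadedReceiver {
    public static void main(String[] args) throws Exception {
        DatagramSocket socket = new DatagramSocket(9996); // placeholder port
        int receiverThreads = 4;                          // arbitrary count
        for (int i = 0; i < receiverThreads; i++) {
            new Thread(() -> {
                byte[] buffer = new byte[2048];
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                try {
                    while (true) {
                        packet.setLength(buffer.length);
                        socket.receive(packet);
                        // hand packet.getData()/getLength() off to processing here
                    }
                } catch (Exception e) {
                    // socket closed or interrupted; let the thread exit
                }
            }, "udp-receiver-" + i).start();
        }
    }
}
```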