Making UDP lose fewer messages in .NET/Mono
We are currently performing some benchmarks for an open-source academic project, Logbus-ng. It basically implements the Syslog protocol over UDP (RFC 5426) and TLS (RFC 5425).
We know that the advantage of TLS is reliability (i.e. we won't lose messages), but it comes with a performance drawback.
We have a benchmarking client and also a specially forged Apache installation that sends messages at high rates.
Our goal is to reduce the loss of UDP packets to a minimum. Apache 1.3.41 has been instrumented to send special log messages via UDP (not in Syslog format, but in a special short syntax we parse on the server side), and this instrumentation makes it send over 2000 messages when httpd starts, which is exactly what we want :)
Moreover, I can tell you that during Apache's ramp-up phase this small batch of messages (compared to the other workloads we submit to the log server) is sent at an extremely high rate, possibly flooding UDP.
Now, the log server sits on a different machine than the HTTP server, and neither has particularly good hardware (not even a dual-core CPU, but a Pentium 4 with HyperThreading). The log server code is written in C#. The following method is run by 4 threads at AboveNormal priority:
UdpClient _client;
IQueue<T>[] _byteQueues; // not really IQueue, but a special FIFO queue class that reduces overhead to the minimum
volatile bool _listen;   // loop flag, cleared when the listener is stopped
int _currentQueue = -1;  // shared round-robin counter, advanced with Interlocked
const int WORKER_THREADS = 4;

private void ListenerLoop()
{
    IPEndPoint remoteEndpoint = new IPEndPoint(IPAddress.Any, 0);
    while (_listen)
    {
        try
        {
            // Blocks until a datagram arrives and returns its payload as a new byte[]
            byte[] payload = _client.Receive(ref remoteEndpoint);

            // Round-robin dispatch to one of the worker queues; the double modulo keeps
            // the index non-negative even if the counter wraps past Int32.MaxValue
            _byteQueues[
                (((Interlocked.Increment(ref _currentQueue)) % WORKER_THREADS) + WORKER_THREADS) % WORKER_THREADS].
                Enqueue(payload);
        }
        catch (SocketException)
        {
        }
        catch (Exception)
        {
        } // Really do nothing? Shouldn't we stop the service?
    }
}
In order to reduce the time each thread spends outside the Receive method, we don't parse a message as soon as it is received; instead we store it in one of 4 special queues that are read by other worker threads. As far as I know, the .NET scheduler is greedy, so no matter how long lower-priority threads have been waiting, higher-priority threads are scheduled first and can potentially starve them; this is why we currently don't worry about the growing number of threads in the application (around 20 globally).
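Conceptually, each worker thread just drains its queue and does the parsing off the receive path. The following is only a rough sketch of that side (the blocking Dequeue semantics and the ParseMessage name are stand-ins for illustration, not our actual API):

private void WorkerLoop(IQueue<byte[]> queue)
{
    while (_listen)
    {
        // Dequeue is assumed to block until a payload is available
        byte[] payload = queue.Dequeue();
        if (payload == null) continue;
        try
        {
            ParseMessage(payload); // stand-in for the short-syntax log parser
        }
        catch (FormatException)
        {
            // a malformed datagram must not kill the worker thread
        }
    }
}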
Not only do we raise thread priority, we also try to increase the UDP receive buffer size to 1 MB. Here is a fragment of the initialization code:
try
{
Socket clientSock = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp)
{
#if !MONO
//Related to Mono bug 643475
ExclusiveAddressUse = true,
#endif
};
if (ReceiveBufferSize >= 0) clientSock.ReceiveBufferSize = ReceiveBufferSize;
clientSock.Bind(localEp);
_client = new UdpClient {Client = clientSock};
}
catch (SocketException ex)
{
throw new LogbusException("Cannot start UDP listener", ex);
}
ReceiveBufferSize is configured at runtime...
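Note that the OS may silently clamp the requested size (on Linux, SO_RCVBUF is capped by net.core.rmem_max), so reading the property back after setting it is a cheap sanity check. A small illustrative helper (the method name and the warning text are made up, not Logbus-ng code):

static void SetReceiveBuffer(Socket sock, int requestedBytes)
{
    sock.ReceiveBufferSize = requestedBytes;
    // Re-read the property: the kernel may have granted less than we asked for
    int effective = sock.ReceiveBufferSize;
    if (effective < requestedBytes)
    {
        Console.Error.WriteLine(
            "UDP receive buffer: requested {0} bytes, got {1} (check net.core.rmem_max on Linux)",
            requestedBytes, effective);
    }
}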
Each log message sent by Apache is very short, I think no more than 50 bytes, and we run gigabit Ethernet in our lab.
During the last experiment with this configuration, the log server received only 700+ of the more than 2900 messages generated. Wireshark reported more than 2900 messages on the UDP socket, but the log trace of Logbus (which stores all received messages into a file) contains only those 700/800. Running cat /proc/net/udp (and using lsof to find the correct row) reports lots of dropped packets. The logs are definitely sent at a very high rate. If we modify the Apache core to sleep for a short time (a little less than a millisecond) after each log call, we reduce the loss to zero, but performance is also reduced to almost zero. We will do such a test, but we must prove the effectiveness of Logbus-ng in real-life scenarios :(
My straight questions are:
- Does UdpClient.ReceiveBufferSize help prevent packet loss? What else can I do in C#?
- It is obviously supposed to work in Mono too, but do you know of possible bugs with that property? I mean, has anyone ever reported a bug? (Mono 2.8)
- Do you know whether sending packets to localhost first may reduce packet loss? (I would run a special instance of the log server on the web server machine and then forward the logs via TLS, which doesn't lose messages, to the real log server; see the sketch after this list.)
- What would you suggest to decrease the loss rate?
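To be concrete about the third question, the relay I have in mind would behave roughly like the sketch below. This is only an illustration: the host name, ports and the RFC 5425 octet-counting framing are placeholders, and in practice I would reuse Logbus-ng's own TLS transport rather than hand-rolling one.

using System.Net;
using System.Net.Security;
using System.Net.Sockets;
using System.Text;

class LocalUdpToTlsRelay
{
    // Receive datagrams on the loopback interface and forward each one over a
    // single TLS connection, framed with octet counting ("length SP payload")
    static void Run()
    {
        var udp = new UdpClient(new IPEndPoint(IPAddress.Loopback, 5140)); // placeholder local port
        var tcp = new TcpClient("logserver.example", 6514);                // placeholder remote host/port
        var tls = new SslStream(tcp.GetStream());
        tls.AuthenticateAsClient("logserver.example");

        var remote = new IPEndPoint(IPAddress.Any, 0);
        while (true)
        {
            byte[] payload = udp.Receive(ref remote);
            byte[] header = Encoding.ASCII.GetBytes(payload.Length + " ");
            tls.Write(header, 0, header.Length);
            tls.Write(payload, 0, payload.Length);
            tls.Flush();
        }
    }
}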
We currently have to perform this special test with Apache, and we can only use UDP to deliver the messages. We can't choose TLS because we only have C# APIs for it.
Thank you in advance for any help. I hope I have been clear. You can find the source code of the UDP receiver on SVN if it helps.
Answers (3)
ReceiveBufferSize definitely affects UDP sockets (i.e. UdpClient); if the packet loss is due to buffer overflow, then yes, increasing ReceiveBufferSize will help.
Keep in mind that if the data rate is so high that you simply cannot read from the buffer quickly enough for long enough, it is inevitable that you will overflow even the largest of buffers.
I have used UdpClient.Client.ReceiveBufferSize effectively on Mono 2.6.7 running on Ubuntu, so I believe the Mono implementation is fine; of course, I have not used it with Mono 2.8 yet.
In my experience, when sending UDP packets to localhost at extremely high rates, some packet loss is possible, though I have never experienced this packet loss in a real-world application. So you might have some success with this approach.
You also need to look at where the packet loss is occurring: it might be due to the network infrastructure, packet collisions, or a switch dropping packets because of some limit on the switch.
Simply put, you need to expect, and be ready to handle, packet loss when using UDP.
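One more thing you can do on the C# side, regardless of the buffer size, is to spend less time and generate less garbage per datagram in the receive loop. A rough, untested sketch that reads via the Socket directly and copies payloads into large reusable slabs instead of allocating a fresh byte[] per message (the field names are borrowed from your question; the slab and buffer sizes are arbitrary):

private void SlabListenerLoop(IQueue<ArraySegment<byte>> queue)
{
    const int SlabSize = 1 << 20;              // 1 MB slab, illustrative
    byte[] slab = new byte[SlabSize];
    int offset = 0;
    byte[] datagram = new byte[65536];         // large enough for any UDP payload, reused across calls
    EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
    Socket sock = _client.Client;
    while (_listen)
    {
        try
        {
            int read = sock.ReceiveFrom(datagram, ref remote);
            if (offset + read > SlabSize)
            {
                // start a new slab; the old one stays alive only while queued segments reference it
                slab = new byte[SlabSize];
                offset = 0;
            }
            Buffer.BlockCopy(datagram, 0, slab, offset, read);
            queue.Enqueue(new ArraySegment<byte>(slab, offset, read));
            offset += read;
        }
        catch (SocketException)
        {
            // ignore and keep listening, as in the original loop
        }
    }
}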
If you are using UDP, you must expect packet loss. There are a lot of reasons for packet loss; in your case it is most probably flooding somewhere along the channel. It could be your switch or it could be your receiver. The reason for the flooding is that UDP doesn't have any kind of congestion control. TCP does have congestion control (slow start), which is why TCP (theoretically, in a perfect environment) never floods the receiver.
A promising way of preventing flooding in a UDP transmission is to employ a TCP-style slow-start congestion control strategy manually.
To answer your straight questions:
1. No. See the TCP slow-start algorithm.
2. Most probably it is not a bug. UDP is designed that way; there is a reason for that and a need for that.
3. No. A proxy does not help.
4. The simplest implementation to combat packet loss due to flooding is to wait for an acknowledgement from the receiver (noting that the receiver has successfully received the packet) before sending more packets. Of course, this does not help against packet loss due to other causes.
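To make point 4 concrete, a minimal stop-and-wait sender over UDP could look like the sketch below. Everything specific in it (host, port, the one-byte ACK and the 200 ms timeout) is invented for the example, and a real implementation would also need sequence numbers so a delayed ACK cannot be mistaken for a new one:

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class StopAndWaitSender
{
    static void Send(string[] messages)
    {
        using (var client = new UdpClient())
        {
            client.Connect("logserver.example", 5140);   // placeholder host/port
            client.Client.ReceiveTimeout = 200;           // ms to wait for the ACK before resending
            var ackSource = new IPEndPoint(IPAddress.Any, 0);

            foreach (string msg in messages)
            {
                byte[] payload = Encoding.UTF8.GetBytes(msg);
                bool acked = false;
                while (!acked)
                {
                    client.Send(payload, payload.Length);
                    try
                    {
                        byte[] ack = client.Receive(ref ackSource);
                        acked = ack.Length == 1 && ack[0] == 0x06;   // a single ASCII ACK byte is expected back
                    }
                    catch (SocketException)
                    {
                        // timeout: resend the same datagram
                    }
                }
            }
        }
    }
}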
Apparently, setting the socket receive buffer size was badly broken in Mono up to version 3.2.7: instead of the specified size, it would get set to random values, so trying to increase the buffer size could actually make performance worse :-(
https://github.com/mono/mono/commit/bbd4ee4181787189fbb1f8ba6364afdd982ae706