Why doesn't my Linux prio-based tc improve network latency?
I am building a real-time embedded Linux application that has a variety of network traffic. Of that traffic, two connections are time-critical. One carries the input data and the other the output data. My application needs this traffic to have priority over the other, non-time-critical traffic.
I care about two things:
- Minimize the number of dropped packets due to overload on these two connections.
- Minimize the latency through the device (input to output) on these two connections.
I've come (somewhat!) up to speed on Linux traffic control, and understand that it primarily applies to egress traffic, as the remote device is responsible for the priority of the data it sends to me. I have set up my application as a real-time process and have worked through the issues related to what priority to run it at.
I now embark on setting up tc. For my test case, here is what I use:
tc qdisc add dev eth0 root handle 1: prio bands 3 priomap 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2
tc qdisc add dev eth0 parent 1:1 handle 10: pfifo
tc qdisc add dev eth0 parent 1:2 handle 20: pfifo
tc qdisc add dev eth0 parent 1:3 handle 30: pfifo
Basically I am saying: Send all priority 7 traffic over band 0, and all other traffic over band 2. Once I have this simple test working I will do a better job handling other traffic.
First let's verify my expectations:
What I expect is that any traffic having priority 7 should always go out before traffic having any other priority. This should leave the latency of such traffic relatively unaffected by other traffic on the box, no? My mtu is set to 1500, and I am getting about 10 MB/sec through the interface. The maximum additional latency on band 0 caused by band 2 traffic is one packet (<=1500 bytes), or 150 us (1500 bytes / 10 MBytes/sec = 150 us).
Here is my test setup:
Two Linux boxes. Box 1 runs a TCP server that echoes input data. Box 2 connects to box 1, sends packets over TCP, and measures the latency (time sent to time received).
I use the same tc setup on both Linux boxes.
In the applications (both server and client), I set the SO_PRIORITY on the socket as follows:
int so_priority = 7;   /* 7 should map to band 0 via the priomap above */
if (setsockopt(m_socket.native(), SOL_SOCKET, SO_PRIORITY, &so_priority, sizeof(so_priority)) < 0)
    perror("setsockopt(SO_PRIORITY)");
I use tc to verify that my traffic goes over band 0, and all other traffic over band 2:
tc -s qdisc ls dev eth0
Here's the rub: When there is no other traffic, I see latencies in the 500 us range. When there is other traffic (for example, an scp job copying a 100 MB file), the latencies jump up to 10+ ms. What is really strange is that NONE of the tc work I did has any effect. In fact, if I swap the bands (so all my traffic goes over the lower-priority band 2, and other traffic over band 1), I don't see any difference in latency.
What I was expecting is that when there is other traffic on the network, I would see an increase in latency of about 150 us, not 10 ms! By the way, I have verified that loading the box with other (non-real time priority) processes does not affect latency, nor does traffic on other interfaces.
One other item of note is that if I drop the mtu to 500 bytes, the latency decreases to about 5 ms. Still, this is an order of magnitude worse than the unloaded case. Also--why does changing the mtu affect it so much, but using tc to set up priority queuing has no effect?
Why is tc not helping me? What am I missing?
Thanks!
Eric
3 Answers
You didn't say anything about the rest of your network, but I'm guessing you're hitting a queue at an upstream router, which usually has long queues to optimize for throughput. The best way to fix it is to feed your priority queue into a shaper with a bandwidth just under your upstream bandwidth. That way your bulk-priority packets will queue up inside your box instead of at an external router, allowing your high-priority packets to jump to the front of the queue as you expect.
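For illustration, here is a sketch of that arrangement (the 90mbit rate is an assumption and should be set just below the real upstream bottleneck): an HTB class does the shaping, and the prio qdisc from the question is nested inside it.

tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1: htb default 1
tc class add dev eth0 parent 1: classid 1:1 htb rate 90mbit ceil 90mbit
tc qdisc add dev eth0 parent 1:1 handle 2: prio bands 3 priomap 2 2 2 2 2 2 2 0 2 2 2 2 2 2 2 2
tc qdisc add dev eth0 parent 2:1 handle 10: pfifo
tc qdisc add dev eth0 parent 2:2 handle 20: pfifo
tc qdisc add dev eth0 parent 2:3 handle 30: pfifo

Because the shaping rate is below the bottleneck, any backlog builds up in the prio qdisc on this box, so the high-priority band actually gets consulted.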
The prio facility will simply send the highest priority packet available at the time when it is sending packets (typically as soon as the previous packet has been sent unless there are no packets waiting to go out).
Your test relies on packets being placed on the queue by the appropriate program's process on each machine, and on received packets being retrieved from the socket on each machine.
Any scheduling delay which affects the time that a process gets on either machine might affect the process's ability to place a message on the queue or to retrieve and process a message from the queue. It sounds like you have loaded at least one of the machines to test for this, but my experience is that machine loading will definitely affect latency measured like this (on the order of milliseconds, not microseconds), so it might be worth repeating this with both machines loaded with high-priority tasks.
The other thing to check is the timestamp you are using to measure the latency - is it the time the echoed message is actually received at the client machine, or the time your program processes it? If the latter, then you are measuring not just the network latency but also the time between the message being received and your program getting a slice of the processor and reaching the point where you check the time - see http://wiki.wireshark.org/Timestamps.
As an aside, I don't think you will be able to get guaranteed microsecond-level responsiveness without a real-time-OS-like mechanism. On the other hand, if your application is VoIP-like, then you will usually be OK with up to about 200 ms of latency.
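To make that distinction concrete, here is a hypothetical client-side measurement (not code from the question): both timestamps are taken in user space, so the second one also includes whatever scheduling delay the client process experiences after the echoed data has actually arrived at the interface.

#include <time.h>
#include <unistd.h>

/* Round-trip time in milliseconds for one echo over an already-connected
 * TCP socket fd; assumes the server echoes back exactly what it receives. */
static double echo_rtt_ms(int fd, const char *payload, size_t len)
{
    char buf[2048];
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);     /* stamped just before sending */
    if (write(fd, payload, len) != (ssize_t)len)
        return -1.0;
    ssize_t n = read(fd, buf, sizeof(buf));  /* blocks until the echo is readable */
    clock_gettime(CLOCK_MONOTONIC, &t1);     /* stamped once this process runs again */
    if (n <= 0)
        return -1.0;

    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

Capturing with Wireshark or tcpdump on the client, as suggested above, timestamps the packets closer to the wire and separates the network component from the scheduling component.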
Have you tried capturing the packets and checking whether the TOS value in the IP header has been changed?
You need Linux 2.6.39 or higher in order to use SO_PRIORITY.
You should change the IP_TOS instead. You should set:
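A minimal sketch of that call, assuming the standard IPTOS_LOWDELAY value (the specific TOS/DSCP value and the error handling are illustrative, not part of the original answer):

#include <netinet/in.h>     /* IPPROTO_IP, IP_TOS */
#include <netinet/ip.h>     /* IPTOS_LOWDELAY */
#include <stdio.h>
#include <sys/socket.h>

int tos = IPTOS_LOWDELAY;   /* 0x10; a DSCP value such as 0xB8 (EF) is another common choice */
if (setsockopt(m_socket.native(), IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) < 0)
    perror("setsockopt(IP_TOS)");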