Migrating from Solaris to RH: network latency issue, TCP window size and other TCP parameters
I have a client/server application (Java) that I'm migrating from Solaris to RH Linux. Since I started running it on RH, I have noticed some latency issues. I managed to isolate the problem to the following scenario:
- The client sends 5 messages (32 bytes each) in a row (same application timestamp) to the server.
- The server echoes the messages.
- The client receives the replies and prints the round-trip time for each message.
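The three steps above can be sketched as a small self-contained probe. This is only a hypothetical reproduction over loopback, not the original application: the class name `EchoRttProbe`, the fixed 32-byte framing and the use of an ephemeral port are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical reproduction of the test; names and framing are assumptions.
class EchoRttProbe {
    static final int MSG_SIZE = 32; // bytes per message, as in the description
    static final int BURST = 5;     // messages sent back to back

    /** Sends one burst over loopback and returns per-message round-trip times in nanoseconds. */
    static long[] measureBurst() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread echo = new Thread(() -> echoOnce(server));
            echo.setDaemon(true);
            echo.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setTcpNoDelay(true); // TCP_NODELAY, as in the original setup
                DataOutputStream out = new DataOutputStream(client.getOutputStream());
                DataInputStream in = new DataInputStream(client.getInputStream());

                byte[] msg = new byte[MSG_SIZE];
                long sentAt = System.nanoTime(); // one timestamp for the whole burst
                for (int i = 0; i < BURST; i++) {
                    out.write(msg); // one write per message
                }
                long[] rtt = new long[BURST];
                for (int i = 0; i < BURST; i++) {
                    in.readFully(msg); // wait for the next complete echo
                    rtt[i] = System.nanoTime() - sentAt;
                }
                return rtt;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Accepts one connection and echoes each fixed-size message back. */
    private static void echoOnce(ServerSocket server) {
        try (Socket s = server.accept()) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(s.getInputStream());
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            byte[] buf = new byte[MSG_SIZE];
            for (int i = 0; i < BURST; i++) {
                in.readFully(buf);
                out.write(buf);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] rtt = measureBurst();
        for (int i = 0; i < rtt.length; i++) {
            System.out.printf("msg %d rtt %.3f ms%n", i + 1, rtt[i] / 1e6);
        }
    }
}
```

Over loopback all five times will be tiny. To see the effect described here, the two halves would have to run on hosts separated by a real 80 ms path.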
On Solaris, all is well: I get all 5 replies at the same time, roughly 80 ms after sending the original messages (client and server are several thousand miles apart; my ping RTT is 80 ms, so this is normal).
On RH, the first 3 messages are echoed normally (they arrive 80 ms after being sent), but the last 2 arrive 80 ms later (so 160 ms RTT in total).
The pattern is always the same. It clearly looks like a TCP problem.
On my Solaris box, I had previously configured the TCP stack with 2 specific options:
- disable the Nagle algorithm globally
- set tcp_deferred_acks_max to 0
On RH, it's not possible to disable Nagle globally, but I disabled it on all of my app's sockets (TCP_NODELAY).
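For reference, a minimal sketch of the per-socket setting in Java, with a read-back to confirm the option actually took effect. The helper name `enableNoDelay` is hypothetical; `Socket.setTcpNoDelay` and `Socket.getTcpNoDelay` are the standard calls.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper; only setTcpNoDelay/getTcpNoDelay come from the JDK.
class NoDelaySockets {
    /** Disables Nagle's algorithm on the socket and reports what the socket says afterwards. */
    static boolean enableNoDelay(Socket socket) {
        try {
            socket.setTcpNoDelay(true);    // request TCP_NODELAY
            return socket.getTcpNoDelay(); // read it back rather than assume it stuck
        } catch (SocketException e) {
            throw new IllegalStateException("could not set TCP_NODELAY", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + enableNoDelay(new Socket()));
    }
}
```

On the server side it is safest to apply the same call to every socket returned by `ServerSocket.accept()`, since the option is set per connection.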
So I started playing with tcpdump (on the server machine) and compared both outputs:
SOLARIS:
22 2.085645 client server TCP 56150 > 6006 [PSH, ACK] Seq=111 Ack=106 Win=66672 Len=22 "MSG_1 RCV"
23 2.085680 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=133 Win=50400 Len=0
24 2.085908 client server TCP 56150 > 6006 [PSH, ACK] Seq=133 Ack=106 Win=66672 Len=22 "MSG_2 RCV"
25 2.085925 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=155 Win=50400 Len=0
26 2.086175 client server TCP 56150 > 6006 [PSH, ACK] Seq=155 Ack=106 Win=66672 Len=22 "MSG_3 RCV"
27 2.086192 server client TCP 6006 > 56150 [ACK] Seq=106 Ack=177 Win=50400 Len=0
28 2.086243 server client TCP 6006 > 56150 [PSH, ACK] Seq=106 Ack=177 Win=50400 Len=21 "MSG_1 ECHO"
29 2.086440 client server TCP 56150 > 6006 [PSH, ACK] Seq=177 Ack=106 Win=66672 Len=22 "MSG_4 RCV"
30 2.086454 server client TCP 6006 > 56150 [ACK] Seq=127 Ack=199 Win=50400 Len=0
31 2.086659 server client TCP 6006 > 56150 [PSH, ACK] Seq=127 Ack=199 Win=50400 Len=21 "MSG_2 ECHO"
32 2.086708 client server TCP 56150 > 6006 [PSH, ACK] Seq=199 Ack=106 Win=66672 Len=22 "MSG_5 RCV"
33 2.086721 server client TCP 6006 > 56150 [ACK] Seq=148 Ack=221 Win=50400 Len=0
34 2.086947 server client TCP 6006 > 56150 [PSH, ACK] Seq=148 Ack=221 Win=50400 Len=21 "MSG_3 ECHO"
35 2.087196 server client TCP 6006 > 56150 [PSH, ACK] Seq=169 Ack=221 Win=50400 Len=21 "MSG_4 ECHO"
36 2.087500 server client TCP 6006 > 56150 [PSH, ACK] Seq=190 Ack=221 Win=50400 Len=21 "MSG_5 ECHO"
37 2.165390 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=148 Win=66632 Len=0
38 2.166314 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=190 Win=66588 Len=0
39 2.364135 client server TCP 56150 > 6006 [ACK] Seq=221 Ack=211 Win=66568 Len=0
REDHAT:
17 2.081163 client server TCP 55879 > 6006 [PSH, ACK] Seq=111 Ack=106 Win=66672 Len=22 "MSG_1 RCV"
18 2.081178 server client TCP 6006 > 55879 [ACK] Seq=106 Ack=133 Win=5888 Len=0
19 2.081297 server client TCP 6006 > 55879 [PSH, ACK] Seq=106 Ack=133 Win=5888 Len=21 "MSG_1 ECHO"
20 2.081711 client server TCP 55879 > 6006 [PSH, ACK] Seq=133 Ack=106 Win=66672 Len=22 "MSG_2 RCV"
21 2.081761 client server TCP 55879 > 6006 [PSH, ACK] Seq=155 Ack=106 Win=66672 Len=22 "MSG_3 RCV"
22 2.081846 server client TCP 6006 > 55879 [PSH, ACK] Seq=127 Ack=177 Win=5888 Len=21 "MSG_2 ECHO"
23 2.081995 server client TCP 6006 > 55879 [PSH, ACK] Seq=148 Ack=177 Win=5888 Len=21 "MSG_3 ECHO"
24 2.082011 client server TCP 55879 > 6006 [PSH, ACK] Seq=177 Ack=106 Win=66672 Len=22 "MSG_4 RCV"
25 2.082362 client server TCP 55879 > 6006 [PSH, ACK] Seq=199 Ack=106 Win=66672 Len=22 "MSG_5 RCV"
26 2.082377 server client TCP 6006 > 55879 [ACK] Seq=169 Ack=221 Win=5888 Len=0
27 2.171003 client server TCP 55879 > 6006 [ACK] Seq=221 Ack=148 Win=66632 Len=0
28 2.171019 server client TCP 6006 > 55879 [PSH, ACK] Seq=169 Ack=221 Win=5888 Len=42 "MSG_4 ECHO + MSG_5 ECHO"
29 2.257498 client server TCP 55879 > 6006 [ACK] Seq=221 Ack=211 Win=66568 Len=0
So I got confirmation that things are not working correctly on RH: packet 28 is sent too late. It looks like the server is waiting for the ACK in packet 27 before sending anything more.
That seems to me the most likely reason...
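The timing can be read straight off the RH dump. The sketch below (the class name `DumpGaps` is hypothetical) takes the timestamp column from three of the lines above and computes the gaps: packet 28 leaves about 88.6 ms after packet 26, but only about 16 microseconds after the ACK in packet 27 arrives.

```java
// Hypothetical helper for reading gaps out of the capture lines shown above.
class DumpGaps {
    /** Returns the capture timestamp (second whitespace-separated field) in seconds. */
    static double timestamp(String line) {
        return Double.parseDouble(line.trim().split("\\s+")[1]);
    }

    /** Milliseconds elapsed between two capture lines. */
    static double gapMillis(String earlier, String later) {
        return (timestamp(later) - timestamp(earlier)) * 1000.0;
    }

    public static void main(String[] args) {
        String p26 = "26 2.082377 server client TCP 6006 > 55879 [ACK] Seq=169 Ack=221 Win=5888 Len=0";
        String p27 = "27 2.171003 client server TCP 55879 > 6006 [ACK] Seq=221 Ack=148 Win=66632 Len=0";
        String p28 = "28 2.171019 server client TCP 6006 > 55879 [PSH, ACK] Seq=169 Ack=221 Win=5888 Len=42";

        System.out.printf("26 -> 28: %.3f ms%n", gapMillis(p26, p28));
        System.out.printf("27 -> 28: %.3f ms%n", gapMillis(p27, p28));
    }
}
```

The near-zero gap between packets 27 and 28 is what suggests the server was holding the last two echoes until that ACK came back.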
Then I realized that the "Win" values differ between the Solaris and RH dumps: 50400 on Solaris, only 5888 on RH. That's another hint...
I read the documentation about the sliding window and buffer sizes, and played around with the rcvBuffer and sendBuffer settings in Java on my sockets, but never managed to change this 5888 value to anything else (I checked each time directly with tcpdump).
Does anybody know how to do this? I'm having a hard time getting definitive information, as in some cases there's "auto-negotiation" that I might need to bypass, etc.
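One thing worth checking is where the buffer size is set. The JDK documentation says a receive buffer should be requested before a client socket connects, and on the `ServerSocket` before it is bound, because the window is negotiated during connection setup. The sketch below shows the server-side ordering. The class and method names are hypothetical, and the operating system is free to round, cap or double the requested value. Note also that "Win" on a packet is the sender's own receive window, so the 5888 on the server's packets limits what the client may send. It may therefore not be what is holding back the server's echoes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.SocketException;

// Hypothetical helper; the ordering (set size, then bind) is the point of the example.
class ReceiveBufferSetup {
    /** Opens a loopback listener whose SO_RCVBUF was requested before bind(). */
    static ServerSocket listenWithReceiveBuffer(int port, int requestedBytes) {
        try {
            ServerSocket server = new ServerSocket();    // created unbound
            server.setReceiveBufferSize(requestedBytes); // request SO_RCVBUF first
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the size the OS actually reports, which may differ from the request. */
    static int effectiveReceiveBuffer(ServerSocket server) {
        try {
            return server.getReceiveBufferSize();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = listenWithReceiveBuffer(0, 64 * 1024)) {
            System.out.println("requested 65536, OS reports " + effectiveReceiveBuffer(server));
        }
    }
}
```

Sockets returned by `accept()` take the listener's value as their proposed default, so comparing the reported size with the request is a quick way to see whether the OS honoured it.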
I eventually managed to get rid of my initial problem, but only partially, by setting the "tcp_slow_start_after_idle" parameter to 0 on RH; it did not change the "Win" value at all. The same problem was still there for the first 4 groups of 5 messages, with TCP Retransmission and TCP Dup ACK entries in tcpdump; then the problem disappeared altogether for all following groups of 5 messages.
It doesn't seem like a very clean and/or generic solution to me. I'd really like to reproduce the exact same conditions under both OSes.
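To illustrate why a sender-side window that restarts after idle could produce exactly this 80 ms / 160 ms split, here is a deliberately simplified toy model. It is not the kernel's algorithm: window growth is ignored, and the restart window of 3 segments is an assumption chosen only because three echoes leave before the stall in the RH dump.

```java
// Toy model only; the window size of 3 is an assumption matching the dump, not a kernel constant.
class IdleRestartToyModel {
    /**
     * For each queued message, returns how many ACK round trips pass before it may be sent,
     * if at most `window` segments can be in flight and the window only reopens
     * when an ACK comes back one RTT later.
     */
    static int[] sendRound(int messages, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        int[] round = new int[messages];
        for (int i = 0; i < messages; i++) {
            round[i] = i / window; // 0 = sent immediately, 1 = after the first ACK, ...
        }
        return round;
    }

    public static void main(String[] args) {
        int rttMs = 80; // ping RTT from the description
        int[] round = sendRound(5, 3);
        for (int i = 0; i < round.length; i++) {
            // one RTT for the echo itself, plus one RTT per round spent waiting for an ACK
            System.out.printf("echo %d seen by client after ~%d ms%n", i + 1, (round[i] + 1) * rttMs);
        }
    }
}
```

With these inputs the model gives about 80 ms for the first three echoes and about 160 ms for the last two, which is the pattern observed on RH. That fits with tcp_slow_start_after_idle having an effect, though it does not prove that this is the only mechanism involved.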
I'll keep researching, but any help from TCP gurus would be greatly appreciated!
Comments (1)
It looks like the congestion avoidance algorithm is kicking in on the Red Hat box.
Notice that as of packet 26, the server has seen and ACKed everything from the client, but the client has not yet ACKed any of the server's echoes (its Ack is still 106). Notice also that packet 27, which kick-starts things again, is the client acknowledging the server's first two lots of data (packets 19 and 22).
Which congestion control algorithm is the Red Hat box using (/proc/sys/net/ipv4/tcp_congestion_control)? You could try switching to one of the other available ones.
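A quick way to check these settings from the Java side is to read the files under /proc/sys/net/ipv4 directly. The sketch below is a hypothetical helper (`TcpSysctl` is not part of any library) that prints the current algorithm, the ones the kernel reports as available, and the idle-restart flag. It returns an empty result on systems where the files do not exist.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical read-only helper for Linux TCP sysctl files.
class TcpSysctl {
    static final Path IPV4 = Paths.get("/proc/sys/net/ipv4");

    /** Reads one sysctl file under dir, or returns empty if it cannot be read (e.g. not Linux). */
    static Optional<String> read(Path dir, String name) {
        try {
            byte[] raw = Files.readAllBytes(dir.resolve(name));
            return Optional.of(new String(raw, StandardCharsets.US_ASCII).trim());
        } catch (IOException | SecurityException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "tcp_congestion_control",
            "tcp_available_congestion_control",
            "tcp_slow_start_after_idle"
        };
        for (String name : names) {
            System.out.println(name + " = " + read(IPV4, name).orElse("(unavailable)"));
        }
    }
}
```

Changing the algorithm is a system-level setting that normally requires root (for example through `sysctl`), so this helper only reads the values.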