Evaluating/diagnosing the time connections spend in SYN_RECV state before being established
I'm trying to improve the performance of a (virtual) web server with a fairly standard CentOS/Apache setup and one thing I noticed is that new connections seem to "stick" in the SYN_RECV state, sometimes for several seconds, before finally being established and handled by Apache.
My first guess was that Apache could be reaching the limit on the number of connections it's prepared to handle simultaneously. But with keep-alive off, netstat reports only a few established connections (counting just those not involving localhost, so discarding "housekeeping" connections, e.g. between Apache and Tomcat), whereas with keep-alive on it will happily get up to 100+ established connections. Either way there is no clear difference in the SYN_RECV behaviour: there are typically 10-20 connections sitting in SYN_RECV at any one time.
What are people's recommendations for investigating where the bottleneck is that's preventing the connections from being established quickly?
P.S. Follow-on question: does anybody know what a TYPICAL statistic would be for the time for a connection to be established once first "hitting" the server?
Update in case anyone else encounters this: in the end, I wrote a small Java program to take data from /proc/net/tcp and analyse it. It appears that this is happening for a small proportion of connections (although that still means that at any one time there can be a number of connections in this state, because they can stay this way for a number of seconds), and it looks like an issue local to those connections. Over 90% of connections are still going through in < 500ms and 81% in < 200ms. So if others see this, there isn't necessarily any need to panic immediately.
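For anyone wanting to try something similar, here is a minimal sketch of the snapshot part only. It is not the program described above. It assumes the usual Linux /proc/net/tcp layout, where the fourth whitespace-separated column (`st`) holds the socket state as a hex code and `03` means SYN_RECV. The class and method names (`SynRecvCounter`, `countState`) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class SynRecvCounter {
    // Hex state code for TCP_SYN_RECV as printed in the "st" column (assumed layout).
    static final String SYN_RECV = "03";

    // Counts the rows whose "st" column (4th field) equals the given hex state code.
    static long countState(List<String> lines, String stateHex) {
        return lines.stream()
                .skip(1) // first line is the column header
                .map(line -> line.trim().split("\\s+"))
                .filter(fields -> fields.length > 3 && fields[3].equalsIgnoreCase(stateHex))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // IPv4 sockets only; IPv6 sockets are listed separately in /proc/net/tcp6.
        List<String> lines = Files.readAllLines(Paths.get("/proc/net/tcp"));
        System.out.println("Sockets in SYN_RECV: " + countState(lines, SYN_RECV));
    }
}
```

A single snapshot only gives a count. To estimate how long connections stay in SYN_RECV, one option would be to poll the file at a short interval and record when each remote address:port pair first appears in state 03 and when it leaves it.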
Comments (1)
Try capturing a packet trace and see if SYN ACKs are being retransmitted (and the number of re-tx). This could indicate a routing issue (SYN comes in via path A and SYN-ACK goes via path B which is broken).
Also see if these connections have a specific pattern (such as originating from the same network).
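As a rough way to look for such a pattern without a packet trace, the sketch below groups the remote addresses of SYN_RECV sockets in /proc/net/tcp by /24 network. It assumes a little-endian host (e.g. x86), where the 8-digit hex address appears with its bytes reversed, and it assumes state code `03` means SYN_RECV. The /24 grouping is an arbitrary choice, and the names (`SynRecvByNetwork`, `decodeIpv4`, `groupByNetwork`) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SynRecvByNetwork {
    // Decodes an 8-digit hex IPv4 address as printed on a little-endian host,
    // where the lowest byte of the parsed value is the first octet.
    static String decodeIpv4(String hex) {
        long v = Long.parseLong(hex, 16);
        return (v & 0xFF) + "." + ((v >> 8) & 0xFF) + "."
                + ((v >> 16) & 0xFF) + "." + ((v >> 24) & 0xFF);
    }

    // Maps each remote /24 network to the number of sockets in SYN_RECV (state 03).
    static Map<String, Long> groupByNetwork(List<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (int i = 1; i < lines.size(); i++) { // line 0 is the column header
            String[] fields = lines.get(i).trim().split("\\s+");
            if (fields.length < 4 || !fields[3].equalsIgnoreCase("03")) {
                continue;
            }
            String remoteHex = fields[2].split(":")[0]; // rem_address is ADDR:PORT
            if (remoteHex.length() != 8) {
                continue; // not an IPv4 address
            }
            String ip = decodeIpv4(remoteHex);
            String network = ip.substring(0, ip.lastIndexOf('.')) + ".0/24";
            counts.merge(network, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/net/tcp"));
        groupByNetwork(lines).forEach((net, n) -> System.out.println(net + "  " + n));
    }
}
```

If one or two networks account for most of the stuck connections, that would point towards a path-specific problem rather than something on the server itself.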