Does the Thundering Herd problem still exist on Linux?
Many Linux/Unix programming books and tutorials speak about the "Thundering Herd Problem" which happens when multiple threads or forks are blocked on a select() call waiting for readability of a listening socket. When the connection comes in, all threads and forks are woken up but only one "wins" with a successful call to accept(). In the meantime, a lot of CPU time is wasted waking up all the threads/forks for no reason.
I noticed a project which provides a "fix" for this problem in the Linux kernel, but this is a very old patch.
I think there are two variants: one where each fork does select() and then accept(), and one that just does accept().
Do modern Unix/Linux kernels still have the Thundering Herd Problem in both these cases or only the "select() then accept()" version?
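For concreteness, here is a minimal C sketch of the "select() then accept()" variant: a pre-forked server in which every worker blocks in select() on the shared listening socket and then races to call accept(). The port number (8080), the worker count, and the omitted error handling are arbitrary choices for illustration; the other variant would simply drop the select() call and block in accept() directly. Making a single connection (for example with nc localhost 8080) and counting the "woke up" lines shows how many workers the kernel actually wakes.

    /* Pre-forked "select() then accept()" workers sharing one listening
       socket.  Error handling is omitted to keep the sketch short. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define NWORKERS 4

    int main(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);                /* arbitrary port */
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);

        for (int i = 0; i < NWORKERS; i++) {
            if (fork() == 0) {                      /* child: one worker */
                for (;;) {
                    fd_set rfds;
                    FD_ZERO(&rfds);
                    FD_SET(lfd, &rfds);
                    /* Every worker sleeps here on the same listening fd;
                       the question is how many of them are woken when a
                       single connection arrives. */
                    select(lfd + 1, &rfds, NULL, NULL, NULL);
                    fprintf(stderr, "worker %d woke up\n", (int)getpid());
                    int cfd = accept(lfd, NULL, NULL);  /* only one wins */
                    if (cfd >= 0)
                        close(cfd);
                }
            }
        }
        for (;;)
            pause();                                /* parent just waits */
    }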
Comments (5)
For years, most Unix/Linux kernels have serialized the response to accept(2); in other words, only one thread is woken up if more than one is blocking on accept(2) against a single open file descriptor.
OTOH, many (if not all) kernels still have the thundering herd problem in the select-accept pattern, as you describe.
I have written a simple script ( https://gist.github.com/kazuho/10436253 ) to verify the existence of the problem, and found that the problem exists on Linux 2.6.32 and Darwin 12.5.0 (OS X 10.8.5).
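For contrast with the select()-based sketch under the question, here is a hedged sketch of the plain accept(2) variant that this answer says is serialized (this is not the linked gist, just an illustration of the same idea; port 8081 and the worker count are arbitrary):

    /* Pre-forked workers blocking directly in accept(2) on a shared
       listening socket; error handling omitted. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define NWORKERS 4

    int main(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8081);                /* arbitrary port */
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);

        for (int i = 0; i < NWORKERS; i++) {
            if (fork() == 0) {
                for (;;) {
                    /* All workers block here; per the answer, the kernel
                       wakes only one of them for each new connection. */
                    int cfd = accept(lfd, NULL, NULL);
                    fprintf(stderr, "worker %d accepted\n", (int)getpid());
                    if (cfd >= 0)
                        close(cfd);
                }
            }
        }
        for (;;)
            pause();
    }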
This is a very old problem, and for the most part it does not exist any more. The Linux kernel (over the past few years) has seen a number of changes in the way it handles and routes packets up the network stack, and includes many optimizations to ensure both low latency and fairness (i.e., minimizing starvation).
That said, the select system has a number of scalability issues simply by way of its API. When you have a large number of file descriptors, the cost of a select call is very high. This is primarily due to having to build, check, and maintain the FD sets that are passed to and from the system call.
Nowadays, the preferred way to do asynchronous IO is with epoll. The API is far simpler and scales very nicely across various types of load (many connections, lots of throughput, etc.).
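As an illustration of the epoll API this answer recommends (this says nothing about wake-up semantics, only about the interface), here is a minimal single-process accept loop; the port number and the make_listener() helper are inventions for this sketch:

    /* Minimal epoll-based accept loop in a single process. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <arpa/inet.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    /* Build a non-blocking listening TCP socket (port is arbitrary). */
    static int make_listener(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);
        fcntl(lfd, F_SETFL, O_NONBLOCK);
        return lfd;
    }

    int main(void)
    {
        int lfd = make_listener();
        int epfd = epoll_create1(0);

        struct epoll_event ev = { 0 };
        ev.events = EPOLLIN;
        ev.data.fd = lfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);

        for (;;) {
            struct epoll_event events[64];
            /* Unlike select(), the kernel keeps the interest list between
               calls, so there is no fd_set to rebuild on every iteration. */
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++) {
                if (events[i].data.fd == lfd) {
                    int cfd = accept(lfd, NULL, NULL);
                    if (cfd >= 0)
                        close(cfd);  /* a real server would register cfd too */
                }
            }
        }
    }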
It's there and it's real. See this issue that we are seeing in uwsgi: https://github.com/unbit/uwsgi/issues/2611
If I disable the --thunder-lock option in uwsgi, uwsgi won't use the system's proper locking mechanism. In that case, during my peak load I could see a lot of context switches and a lot of wasted time, and consistently high response times from my application (I am talking about 1 lakh, i.e. 100,000, requests per minute on my server at the moment).
I recently tested a scenario where multiple threads polled on a listening unix-domain socket and then accepted the connection. All of the threads woke up from the poll() system call.
This was a custom build of the Linux kernel rather than a distro build, so perhaps there is a kernel configuration option that changes this, but I don't know what that would be.
We did not try epoll.
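A hedged C sketch of the kind of test described here, assuming an arbitrary socket path (/tmp/herd-test.sock) and four threads; compile with -pthread. The listening socket is made non-blocking only so that the threads that lose the race see EAGAIN instead of blocking inside accept():

    /* Several threads poll() the same listening unix-domain socket and
       then try to accept(); compile with: cc -pthread herd_threads.c */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <poll.h>
    #include <pthread.h>
    #include <sys/socket.h>
    #include <sys/un.h>

    #define NTHREADS 4

    static int lfd;                          /* shared listening socket */

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            struct pollfd pfd = { .fd = lfd, .events = POLLIN };
            poll(&pfd, 1, -1);               /* every thread sleeps here */
            fprintf(stderr, "thread %lu woke up\n",
                    (unsigned long)pthread_self());
            int cfd = accept(lfd, NULL, NULL);   /* losers get -1/EAGAIN */
            if (cfd >= 0)
                close(cfd);
        }
        return NULL;
    }

    int main(void)
    {
        struct sockaddr_un addr;
        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, "/tmp/herd-test.sock",
                sizeof addr.sun_path - 1);
        unlink(addr.sun_path);

        lfd = socket(AF_UNIX, SOCK_STREAM, 0);
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);
        fcntl(lfd, F_SETFL, O_NONBLOCK);     /* losing threads do not block */

        pthread_t tid[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);
        return 0;
    }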
Refer to the link below, which talks about separate flags to epoll to avoid this problem.
http://lwn.net/Articles/632590/
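The flag discussed there that eventually landed is, to my understanding, EPOLLEXCLUSIVE (Linux 4.5 and later, with matching headers). A hedged sketch of how pre-forked workers might register the shared listening socket with it, each through its own epoll instance; the port and worker count are arbitrary:

    /* Pre-forked workers, each with its own epoll instance, registering
       the shared listening socket with EPOLLEXCLUSIVE (Linux >= 4.5). */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define NWORKERS 4

    int main(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, 128);

        for (int i = 0; i < NWORKERS; i++) {
            if (fork() == 0) {
                int epfd = epoll_create1(0);
                struct epoll_event ev = { 0 };
                /* EPOLLEXCLUSIVE asks the kernel not to wake every epoll
                   waiter attached to this fd for each incoming connection. */
                ev.events = EPOLLIN | EPOLLEXCLUSIVE;
                ev.data.fd = lfd;
                epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);

                for (;;) {
                    struct epoll_event out;
                    epoll_wait(epfd, &out, 1, -1);
                    int cfd = accept(lfd, NULL, NULL);
                    if (cfd >= 0)
                        close(cfd);
                }
            }
        }
        for (;;)
            pause();
    }

With EPOLLEXCLUSIVE, the kernel is allowed to wake only one (or a few) of the waiters per event instead of all of them, which is exactly the thundering-herd behaviour this question is about.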