Spinning in a non-blocking application without taking up CPU time
I have a UDP network application that reads packets sent to it and then processes them (same thread). The reads are non-blocking so I'm not using poll or select.
Packets received are grouped by sessions.
Work is governed by whether there is a session in progress. If there is no work to be done, i.e. there are no sessions or there are no packets to process, then I need to spin.
I've been looking at the hybrid algorithm found here:
http://www.1024cores.net/home/lock-free-algorithms/tricks/spinning
Been playing with it. I'm told it's more for busy waits. What methods do you use to prevent unnecessary processing and needlessly high CPU usage?
EDIT:
Thanks for all the answers and comments.
I'm now doing the following. When it comes to reading from the network, I look to see if there is other work to be done. If there is, then I call poll with a timeout of zero. I then read as many packets as I can and place them into an in-memory queue for processing. If there is no other work, then I poll indefinitely (i.e. a timeout of -1). It seems to work well: CPU is only high when things are busy, otherwise it drops to zero.
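The approach in the edit can be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual code: `wait_and_drain`, `struct packet_queue`, `QUEUE_CAP` and `MAX_DGRAM` are made-up names, the queue is a simple fixed-size array, and `MSG_DONTWAIT` (available on Linux and the BSDs) is used so the drain loop never blocks even if the socket itself was left in blocking mode.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define QUEUE_CAP 64     /* hypothetical queue capacity */
#define MAX_DGRAM 2048   /* hypothetical maximum datagram size */

/* Hypothetical in-memory queue: a fixed array of received datagrams. */
struct packet_queue {
    char   data[QUEUE_CAP][MAX_DGRAM];
    size_t len[QUEUE_CAP];
    size_t count;
};

/* Wait for the socket, then read as many datagrams as are queued.
 * have_other_work != 0: poll with timeout 0 (check and return at once).
 * have_other_work == 0: poll with timeout -1 (sleep until data arrives).
 * Returns the number of datagrams appended to q, or -1 if poll fails. */
int wait_and_drain(int sock, int have_other_work, struct packet_queue *q)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN, .revents = 0 };
    int timeout_ms = have_other_work ? 0 : -1;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0 || !(pfd.revents & POLLIN))
        return 0;                      /* nothing to read right now */

    int added = 0;
    while (q->count < QUEUE_CAP) {
        ssize_t n = recv(sock, q->data[q->count], MAX_DGRAM, MSG_DONTWAIT);
        if (n < 0)
            break;                     /* EAGAIN/EWOULDBLOCK: socket is empty */
        q->len[q->count] = (size_t)n;
        q->count++;
        added++;
    }
    return added;
}
```

The main loop would call this with `have_other_work` set whenever queued packets or active sessions still need attention, so the thread only sleeps in the kernel when it is truly idle.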
Comments (3)
If you have nothing to do, you should be blocking - if not on the socket itself (i.e. if it's an event loop that processes more than one network socket or event type), then on a gate that gets signaled when something happens (the design depends on how your OS does async I/O).
Spinning is something you should only be doing when you're waiting for a very short period of time (usually only in kernel mode).
How many packets per second are you processing? How long does it take to process those packets? If you use blocking threads, what is the average CPU usage you get?
Unless blocking wait is close to 100% usage, where shaving a few bits of performance from the blocking itself can help, spinning will not improve but rather worsen performance. By spinning, you lock one core that will not be available to run other code (possibly including the code that feeds you with work: i.e. kernel code that reads network and passes up to your app the packets), you burn resources without performing any work at all...
Note that when the article says that it is harder to write blocking code than non-blocking spin waits, the author is not talking about operations for which the blocking version is implemented in the system, but rather about situations where one thread must wait on a condition triggered by other threads (a shared variable value goes above/below a limit, a flag is changed...).
Also, if the cost of checking the condition is high, then spinning will incur that cost on each and every iteration of the loop, and that might well exceed the cost of checking once and performing an expensive wait.
Remember that spinning is an active wait. It does not make sense to ask how to actively wait while not consuming processor time, as the active wait approach implies consuming processor time. What can you do to avoid needless CPU usage? Use a blocking call to get the next packet. In the particular case of reading a UDP packet, I doubt that two calls to the non-blocking read are any cheaper in processing time than a single call to the blocking read operation.
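As a minimal sketch of "use a blocking call to get the next packet": the helper names `make_blocking` and `next_packet` below are hypothetical, and the code assumes a POSIX system where `fcntl` with `O_NONBLOCK` controls the socket mode.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Clear O_NONBLOCK so that recv() sleeps in the kernel while no
 * datagram is queued. Returns 0 on success, -1 on failure. */
int make_blocking(int sock)
{
    int flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(sock, F_SETFL, flags & ~O_NONBLOCK) < 0 ? -1 : 0;
}

/* Block until one datagram arrives and copy it into buf.
 * Retries if a signal interrupts the wait. Returns the datagram
 * length, or -1 on error. */
ssize_t next_packet(int sock, char *buf, size_t cap)
{
    ssize_t n;
    do {
        n = recv(sock, buf, cap, 0);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

While the thread is parked inside `recv`, it uses no CPU at all, and it wakes up as soon as the kernel queues a datagram for the socket.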
Again, think about the questions at the beginning, which can be summed up as: Is blocking proven to be the bottleneck? *Is this a scenario where active waits can actually help?*
Since you have to read from a socket, you can just do a blocking read. Without a packet, you have no reason to be running, right?
If there is more than one socket, then the blocking read won't work, so you need pselect() to monitor multiple descriptors.
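A rough sketch of waiting on several descriptors with pselect() might look like this. The function names `wait_readable` and `read_ready` are hypothetical, and passing NULL for both the timeout and the signal mask (block indefinitely, leave the signal mask unchanged) is an assumption made for illustration.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sleep until at least one of the n sockets is readable.
 * On return, *ready holds the readable descriptors.
 * Returns the number of ready descriptors, or -1 on error. */
int wait_readable(const int *socks, size_t n, fd_set *ready)
{
    int maxfd = -1;

    FD_ZERO(ready);
    for (size_t i = 0; i < n; i++) {
        if (socks[i] < 0 || socks[i] >= FD_SETSIZE)
            return -1;                 /* descriptor does not fit in an fd_set */
        FD_SET(socks[i], ready);
        if (socks[i] > maxfd)
            maxfd = socks[i];
    }
    /* NULL timeout: block indefinitely. NULL sigmask: keep the current mask. */
    return pselect(maxfd + 1, ready, NULL, NULL, NULL, NULL);
}

/* Read one datagram from every socket marked ready, reusing buf.
 * Returns how many datagrams were read. */
int read_ready(const int *socks, size_t n, fd_set *ready,
               char *buf, size_t cap)
{
    int handled = 0;
    for (size_t i = 0; i < n; i++) {
        if (FD_ISSET(socks[i], ready) && recv(socks[i], buf, cap, 0) >= 0)
            handled++;
    }
    return handled;
}
```

A real loop would process each datagram before `buf` is overwritten by the next `recv`; the sketch only shows the waiting and dispatch structure.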
Am I missing something obvious?
It occurs to me that you may have some long-term processing after you do receive a datagram. If the reason you are going with non-blocking I/O is to avoid ignoring incoming traffic while working on a session, then in that case the obvious thing to do is to fork() the sessions. (Hmm, so I still think I must be missing something...)