select() inside an infinite loop uses significantly more CPU on a RHEL 4.8 virtual machine than on a Solaris 10 machine

Posted 2024-08-24 18:34:17


I have a daemon app written in C that is currently running with no known issues on a Solaris 10 machine. I am in the process of porting it over to Linux and have had to make only minimal changes. During testing it passes all test cases; there are no issues with its functionality. However, when I view its CPU usage while 'idle' on my Solaris machine, it uses around 0.03% CPU. On a virtual machine running Red Hat Enterprise Linux 4.8, the same process uses all available CPU (usually somewhere in the 90%+ range).

My first thought was that something must be wrong with the event loop. The event loop is an infinite loop (while(1)) with a call to select(). The timeval is set up so that timeval.tv_sec = 0 and timeval.tv_usec = 1000. This seems reasonable enough for what the process is doing. As a test I bumped timeval.tv_sec up to 1. Even after doing that I saw the same issue.

Is there something I am missing about how select works on Linux vs. Unix? Does it work differently with an OS running on a virtual machine? Or maybe there is something else I am missing entirely?

One more thing: I am not sure which version of VMware Server is being used, though it was updated about a month ago.


2 Answers

朱染 2024-08-31 18:34:18


As Zan Lynx said, the timeval is modified by select on Linux, so you should reassign the correct value before each select call. Also, I suggest checking whether some of the file descriptors are in a particular state (e.g. end of file, peer connection closed...). Maybe the porting is exposing a latent bug in the analysis of the returned values (FD_ISSET and so on). It happened to me too some years ago in a port of a select-driven loop: I was using the returned value in the wrong way, and a closed fd was added to the rd_set, causing select to fail. On the old platform the bad fd happened to have a value higher than maxfd, so it was ignored. Because of the same bug, the program didn't recognize the select failure (select() == -1) and looped forever.

Bye!

深巷少女 2024-08-31 18:34:17


I believe that Linux returns the remaining time by writing it into the time parameter of the select() call and Solaris does not. That means that a programmer who isn't aware of the POSIX spec might not reset the time parameter between calls to select.

This would result in the first call having a 1000 usec timeout and all subsequent calls using a 0 usec timeout.
