Will an IO-blocked process show 100% CPU utilization in top's output?

Posted 2024-10-09 18:22:42

I have an analysis that can be parallelized over a variable number of processes. I expect it to be both IO- and CPU-intensive (very high-throughput short-read DNA alignment, if anyone is curious).

The system running this is a 48-core Linux server.

The question is how to determine the optimum number of processes such that total throughput is maximized. At some point the processes will presumably become IO-bound, such that adding more processes will be of no benefit and possibly detrimental.

Can I tell from standard system monitoring tools when that point has been reached?
Would the output of top (or maybe a different tool) enable me to distinguish between an IO-bound and a CPU-bound process? I suspect that a process blocked on IO might still show 100% CPU utilization.

Comments (3)

恍梦境° 2024-10-16 18:22:42

When a process is blocked on IO, it isn't running, so no time is accounted against it. If there's another process that can run, then that will run instead; if there isn't, the time is counted as 'IO wait', which is accounted as a global statistic.
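
To check whether one particular process is blocked rather than running, one option on Linux is to read its scheduler state from /proc/<pid>/stat. The following is a minimal sketch, not a definitive tool; the helper names process_state and read_state are made up for illustration. A state of R means running or runnable, and D means uninterruptible sleep, which usually (though not always) indicates a wait on disk IO.

```python
def process_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' before reading fields.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_state(pid):
    """Read the current state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return process_state(f.read())
```

Sampling read_state for each worker a few times per second gives a rough idea of how often the workers sit in D instead of R.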

IO wait would be a useful thing to monitor. On Linux it shows up in the CPU summary line of top's header as the wa field. You can monitor it in more detail with tools like iostat and vmstat. Server Fault might be a better place to ask about that.
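
The global IO wait figure can also be computed directly from the aggregate cpu line of /proc/stat, where iowait is the fifth counter after the label. This is a rough sketch assuming a Linux kernel that exposes that counter; the function names are hypothetical.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # user, nice, system, idle, iowait, irq, softirq, steal
    return [int(x) for x in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=1.0):
    """Take two samples of /proc/stat `interval` seconds apart (Linux only)."""
    def read():
        with open("/proc/stat") as f:
            return parse_cpu_line(f.readline())
    first = read()
    time.sleep(interval)
    return iowait_percent(first, read())
```

If this percentage climbs as processes are added while total throughput stays flat, the workload has probably become IO-bound.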

木槿暧夏七纪年 2024-10-16 18:22:42

Even a single IO-bound process will rarely show high CPU utilization, because the operating system has scheduled its IO and the process is usually just waiting for it to complete. So top cannot accurately distinguish between an IO-bound process and a non-IO-bound process that merely uses the CPU periodically. In fact, a system horribly overloaded with IO-bound processes, barely able to accomplish anything, can exhibit very low CPU utilization.

Using only top, as a first pass, you can indeed merely keep adding threads/processes until CPU utilization levels off to determine the approximate configuration for a given machine.
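
As an illustration of "levels off", here is a small sketch that takes hand-recorded measurements and reports the first process count after which the metric stops growing by a meaningful amount. The function name and the 5% threshold are arbitrary assumptions, and the numbers in the usage example are invented.

```python
def first_plateau(samples, min_gain=0.05):
    """Return the process count after which the metric levels off.

    samples: non-empty list of (n_procs, metric) pairs sorted by n_procs,
             where metric could be total CPU utilization read from top.
    min_gain: smallest relative improvement still counted as growth.
    """
    for (n, value), (_, next_value) in zip(samples, samples[1:]):
        if value > 0 and (next_value - value) / value < min_gain:
            return n
    # The metric was still growing at the largest count that was measured.
    return samples[-1][0]


# Invented example: total %CPU observed at each process count.
measurements = [(1, 100), (2, 195), (4, 380), (8, 700), (16, 720), (32, 715)]
print(first_plateau(measurements))
```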

桃气十足 2024-10-16 18:22:42

You can use tools like iostat and vmstat to show how much time processes are spending blocked on I/O. There's generally no harm in adding more processes than you need, but the benefit decreases. You should measure throughput vs. processes as a measurement of overall efficiency.
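
One way to tabulate throughput against process count is sketched below. It assumes you have timed a fixed batch of work at several process counts yourself; scaling_table is a hypothetical helper, and "efficiency" here simply means throughput per process relative to the run with the fewest processes.

```python
def scaling_table(runs):
    """Summarize timed runs as (n_procs, throughput, efficiency) rows.

    runs: non-empty list of (n_procs, items_done, seconds) tuples.
    throughput is items per second; efficiency is per-process throughput
    divided by the per-process throughput of the smallest run.
    """
    rows = sorted(runs)
    base_n, base_items, base_secs = rows[0]
    base_per_proc = base_items / base_secs / base_n
    table = []
    for n, items, secs in rows:
        throughput = items / secs
        table.append((n, throughput, throughput / n / base_per_proc))
    return table
```

An efficiency that drops well below 1.0 while throughput barely rises suggests that the extra processes are mostly waiting.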
