Using 'top' in Linux as semi-permanent instrumentation
I'm trying to find the best way to use 'top' as semi-permanent instrumentation in the development of a box running embedded Linux. (The instrumentation will be removed from the final-test and production releases.)
My first pass is to simply add this to init.d:
top -b -d 15 >/tmp/toploop.out &
This runs top in "batch" mode, writing a fresh snapshot to the file every 15 seconds. Let's assume that /tmp has plenty of space…
Questions:
- Is 15 seconds a good value to choose for general-purpose monitoring?
- Other than disk space, how seriously is this perturbing the state of the system?
- What other (perhaps better) tools could be used like this?
Comments (7)
Look at collectd. It's a very lightweight system monitoring framework coded for performance.
We use sysstat to monitor things like this.
You might find that vmstat and iostat, run with a delay and no repeat count, are a better option.
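A minimal sketch of that suggestion, assuming vmstat (from procps) and iostat (from sysstat) are available on the box. The function names, the `INTERVAL` and `OUTDIR` variables, and the output file names are illustrative and not from the thread. Given a delay but no count, both tools keep reporting until they are killed; note that vmstat's first report shows averages since boot.

```shell
#!/bin/sh
# Hypothetical helper: run vmstat and iostat as background samplers.
# INTERVAL and OUTDIR are illustrative names, not part of either tool.
INTERVAL="${INTERVAL:-15}"
OUTDIR="${OUTDIR:-/tmp}"

start_samplers() {
    # A delay with no count means "report every INTERVAL seconds until killed".
    vmstat "$INTERVAL" > "$OUTDIR/vmstat.out" 2>&1 &
    echo $! > "$OUTDIR/vmstat.pid"

    # iostat ships with sysstat and may be absent on a small image.
    if command -v iostat >/dev/null 2>&1; then
        iostat "$INTERVAL" > "$OUTDIR/iostat.out" 2>&1 &
        echo $! > "$OUTDIR/iostat.pid"
    fi
}

stop_samplers() {
    # Kill whichever samplers were started and remove their pid files.
    for f in "$OUTDIR/vmstat.pid" "$OUTDIR/iostat.pid"; do
        if [ -f "$f" ]; then
            kill "$(cat "$f")" 2>/dev/null || true
            rm -f "$f"
        fi
    done
}
```

Calling `start_samplers` from the init script would replace the single `top` line; keeping the pid files makes it easy to stop the samplers before a timing-sensitive test.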
I suspect 15 seconds would be more than adequate unless you actually want to watch what's happening in real time, but that doesn't appear to be the case here.
As far as load goes, on an idling PIII 900 MHz with 768 MB of RAM running Ubuntu (not sure which version, but not more than a year old), top updating every 0.5 seconds uses about 2% CPU. At 15-second updates, I'm seeing 0.1% CPU utilization.
Depending on exactly what you want, you could use the output of uptime, free, and ps to get most, if not all, of top's information.
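As a rough sketch of that idea, the loop below stitches the three commands into a timestamped snapshot. The function names and the 15-second default are illustrative. The `ps -e -o` form uses POSIX options, which is an assumption about the target: a stripped-down BusyBox `ps` may not accept them.

```shell
#!/bin/sh
# Hypothetical helper: one timestamped snapshot built from uptime, free and ps.
snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
    uptime                        # load averages
    free                          # memory and swap usage
    ps -e -o pid,pcpu,vsz,args    # per-process CPU share and virtual size
}

# Repeat the snapshot forever, sleeping between samples.
# $1 is the interval in seconds (defaults to 15).
monitor_loop() {
    while true; do
        snapshot
        sleep "${1:-15}"
    done
}
```

From an init script this might be started as `monitor_loop 15 >> /tmp/snapshot.out &`. Unlike top, each command exits between samples, so nothing stays resident except the shell itself.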
If you are looking for overall load, uptime is probably sufficient. However, if you want specific information about processes, you are adventurous, and you have the /proc filesystem enabled, you may want to write your own tools. The primary benefit in this environment is that you can focus on exactly what you want and minimize the load introduced to the system.
The proc filesystem gives your application read access to the kernel data that tracks many of the interesting variables. Reading from /proc is one of the lightest ways to get this information. Additionally, you may be able to get more information than top provides. I've done this in the past to get the amount of time a given process spent in user and system mode. You can also use it to get the number of file descriptors a process has open, or detailed information about how the network subsystem is behaving.
Much of this information is pre-processed by other applications, which you can use if they give you the information you need. However, it is rather straightforward to read the raw information. Do a
man proc
for more information.
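To make the /proc approach above concrete, here is a small sketch that reads user and system CPU time and the open file descriptor count for one process. The function names are made up for illustration. The field positions (14 for utime, 15 for stime) come from the `/proc/[pid]/stat` layout documented in `man proc`; values are in clock ticks, and `getconf CLK_TCK` reports how many ticks make a second.

```shell
#!/bin/sh
# Hypothetical helper: print "utime stime" (in clock ticks) for a PID.
proc_cpu_ticks() {
    pid="$1"
    stat=$(cat "/proc/$pid/stat" 2>/dev/null) || return 1
    # Field 2 (comm) is parenthesised and may contain spaces, so drop
    # everything up to and including the last ") " before splitting.
    rest=${stat##*) }
    # After the cut, field 3 (state) is $1, so fields 14 and 15 are $12 and $13.
    set -- $rest
    echo "${12} ${13}"
}

# Hypothetical helper: count the file descriptors a PID has open.
# Reading another user's /proc/PID/fd requires root.
proc_fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}
```

For example, `proc_cpu_ticks 1` and `proc_fd_count 1` sample init; calling them from a sleep loop for just the processes of interest keeps the overhead well below a full top refresh.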
Pity you haven't said what you are monitoring for.
At work, for system monitoring during stress tests, we use a tool called nmon.
What I love about nmon is it has the ability to export to XLS and generate beautiful graphs for you.
It generates statistics for:
Good luck :)