如何分析 Bash shell 脚本启动缓慢?
我的 Bash shell 需要 3-4 秒才能启动,而如果我使用 --norc
启动它,它会立即运行。
我开始通过手动插入 return
语句并寻求速度改进来“分析”/etc/bash.bashrc
和 ~/.bashrc
,但它这不是一个定量的过程,而且效率不高。
如何分析我的 Bash 脚本并查看哪些命令执行时间最长?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
如果您有 GNU
date
(或可以输出纳秒的其他版本),请在/etc/bash.bashrc
的开头(或您想要开始的任何地方)执行此操作任何 Bash 脚本中的跟踪):添加
到
~/.bashrc
的末尾(或者添加到您想要停止跟踪的任何 Bash 脚本部分的末尾)。\011
是八进制制表符。您应该在
/tmp/bashstart.PID.log
中获得跟踪日志,其中显示已执行的每个命令的秒数.纳秒时间戳。一次与下一次的差异是中间步骤所花费的时间。当您缩小范围时,您可以将
set -x
移到后面,将set +x
移到前面(或者有选择地将感兴趣的几个部分括起来)。虽然它不像 GNU date 的纳秒那样细粒度,但 Bash 5 包含一个以微秒为单位给出时间的变量。使用它可以让您不必为每一行生成一个外部可执行文件,并且可以在 Mac 或其他没有 GNU
date
的地方工作 - 当然,只要您有 Bash 5。更改PS4
的设置:正如 @pawamoy 所指出的,如果您有 Bash 4.1 或更高版本,您可以使用
BASH_XTRACEFD
将跟踪的输出发送到单独的文件描述符。来自 这个答案:这将导致跟踪输出转到文件
command.txt
中stdout
和stdout
正常输出(或单独重定向)。理解结果
我们可以使用 OpenOffice / Libre calc 查看每个命令所花费的时间以及运行总计,从而很容易发现在下面的示例中,lesspipe 花费了 6 毫秒。
=B2-B1
,然后将该单元格复制粘贴到整个 C 列。If you have GNU
date
(or another version that can output nanoseconds), do this at the beginning of/etc/bash.bashrc
(or wherever you'd like to begin a trace in any Bash script):add
at the end of
~/.bashrc
(or at the end of the section of any Bash script you'd like tracing to stop). The\011
is an octal tab character.You should get a trace log in
/tmp/bashstart.PID.log
that shows the seconds.nanoseconds timestamp of each command that was executed. The difference from one time to the next is the amount of time that the intervening step took.As you narrow things down, you can move
set -x
later andset +x
earlier (or bracket several sections of interest selectively).Although it's not as fine-grained as GNU
date
's nanoseconds, Bash 5 includes a variable which gives the time in microseconds. Using it saves you from spawning an external executable for every line and works on Macs or elsewhere that doesn't have GNUdate
- as long as you have Bash 5, of course. Change the setting ofPS4
:As pointed out by @pawamoy, you can use
BASH_XTRACEFD
to send the output of the trace to a separate file descriptor if you have Bash 4.1 or later. From this answer:This will cause the trace output to go to the file
command.txt
leavingstdout
andstdout
to be output normally (or be redirected separately).Making sense of the results
We can see the time each command took, as well as a running total, using OpenOffice / Libre calc, making it easy to spot that in the example below, lesspipe is taking 6ms.
=B2-B1
in cell C1, then copy paste that cell to the whole C column.分析 Bash(四个答案,还有更多...)
阅读本文,因为分析是一个重要的一步,我已经对整个堆栈溢出问题做了一些测试和研究,并已经发布了答案。
这里有四个以上的答案:
第一个是基于@DennisWilliamson的想法,但资源消耗少很多
第二个是我自己的(在此之前;)
第三个基于@fgm的答案,但更准确。
最后使用
script
、scriptreplay
和计时文件。最后,对最终的性能进行一些比较。
1. 使用
set -x
和date
但使用有限的分叉取自 @DennisWilliamson 的想法,但使用以下语法,将只有一个初始fork 为三个命令:
这样做只会运行
date
一次。有一个快速演示/测试来展示它是如何工作的:示例脚本:
通过运行此脚本,您可以创建 2 个文件:
/tmp/sample-XXXX.log
和/tmp/sample- XXXX.tim
(其中 XXXX 是正在运行的脚本的进程 ID)。您可以使用
paste
来呈现它们:或者您甚至可以计算差异时间:
或在两列上:
可以渲染:
2.使用
trap debug
和/proc/ 最近 GNU/Linux 内核上的timer_list
,没有分叉。在 GNU/Linux 的最新内核下,您可能会找到一个名为
timer_list
的/proc
文件:其中当前时间是
的总和5461935212966259 + 1383718821564493249,但以纳秒为单位。
因此,为了计算经过的时间,不需要知道偏移量。
对于此类工作,我写了 elap.bash (V2),由以下语法获取:
或
(请参阅完整语法的注释)
因此,您可以简单地在脚本顶部添加此行:
一个小示例:
Do render on my主机:
使用
trap2
而不是trap
作为 source 命令:将渲染两列最后一个命令和总计:
3. 使用 strace
是的,strace 可以完成这项工作:
但它可以制作很多东西!
使用更受限制的命令:
将转储更轻的日志:
根据您要搜索的内容,您可能会受到更多限制:
阅读它们会有点困难:
原始的 bash 脚本 并不那么容易4.
使用
script
、scriptreplay
和 定时文件对于调试脚本,我经常使用:
作为 的一部分>BSD Utils、
script
(和scriptreplay
)是一个非常古老的工具,可用于分析 bash,占用空间非常小。小演示:
将生成:
并生成两个文件:
文件
script.log
包含所有跟踪,script.tim
是计时文件:您可以使用日志文件的第一行和最后一行和/或通过汇总计时文件中的时间来查看总执行时间:
在计时文件中,第二个值是相应日志文件中的下一个字节数。这使您能够重放日志文件,可选地使用加速因子:
或
或
并排显示时间和命令也有点复杂:
进一步 :如何操作由“script”命令创建的计时和打字稿文件?
解决 scriptReplay问题 为了
进行测试,我在 Bash 复杂Hello, World 下载了第二个示例!。该脚本在我的主机上大约需要 0.72 秒才能完成。
我在脚本顶部添加了以下之一:
by
elap.bash
function<前><代码> #!/bin/bash
源 elap.bash-v2 trap2
eval "BUNCHS=(" $(perl <
通过
set -x
和PS4
<前><代码> #!/bin/bash
PS4='+ $(日期 "+%s.%N")\011 '
exec 3>&2 2>/tmp/bashstart.$$.log
设置-x
eval "BUNCHS=(" $(perl <
通过
set -x
和初始 fork 到长执行命令<前><代码> #!/bin/bash
执行3>&2 2> >(tee /tmp/sample-time.$$.log |
sed -u 's/^.*$/now/' | sed -u 's/^.*$/now/' |
日期 -f - +%s.%N >/tmp/sample-time.$$.tim)
设置-x
eval "BUNCHS=(" $(perl <
通过
script
(和set +x
)时间
并比较执行时间(在我的主机上):
输出
elap.bash
函数set -x
和PS4
set -x
和 初始 fork 到长执行命令(以及我的第二个paste
示例脚本),strace
script< /code>
结论
好吧!如果我的纯 Bash 脚本比在每个命令上分支到日期更快,那么使用它意味着对每个命令进行一些操作。
专用独立进程进行日志记录和存储的方式显然更有效。
strace是一种有趣的方式,更详细,但难以阅读。
script
,带有scriptreplay
和加速因子也非常好,但精度不一样,因为这是基于控制台交换而不是进程执行,但非常轻且高效(不同的目标,不同的用途)。最后,我认为在可读性和性能上更有效的是
set + 1 fork
,这个答案的第一个,但总的来说,根据具体情况,我有时也使用strace
和/或script
。更多技巧
当前正在运行脚本的哪一部分?
当您询问计算机(备份服务器或其他)当前发生的情况时,您可以使用
ps -C
命令,但为了显示当前经过的时间:或更好:
这显示根 cron 进程及其当前子进程和所有开始时间。
当前正在处理哪些文件或路径
在 Linux 上,您可以浏览
/proc
伪文件系统,用于显示当前由任何进程打开的文件描述符:当然,所有这些都必须是使用root权限完成(在root会话中或使用
sudo
)。Profiling Bash (four answers, and more...)
Reading this and because profiling is an important step, I've done some test and research about this whole Stack Overflow question and already posted answers.
Here are more than four answers:
The first is based on @DennisWilliamson's idea, but with a lot less of resource consumption
The second was my own (before this;)
The third is based on @fgm's answer, but more accurate.
The last uses
script
,scriptreplay
and timing file.Finally, a little comparison of performance at end.
1. Using
set -x
anddate
but with limited forksTake from @DennisWilliamson's idea, but with the following syntax, there will only one initial fork to three commands:
Doing this will run
date
only once. There is a quick demo/test to show how it work:Sample script:
By running this script, you make 2 files:
/tmp/sample-XXXX.log
and/tmp/sample-XXXX.tim
(where XXXX is process id of running script).You could present them by using
paste
:Or you may even compute diff time:
or on two columns:
May render:
2. Using
trap debug
and/proc/timer_list
on recent GNU/Linux kernels, without forks.Under GNU/Linux's recent kernels, you may find a
/proc
file namedtimer_list
:Where current time is the sum of
5461935212966259 + 1383718821564493249
, but in nanoseconds.So for computing elapsed time, there is no need of knowing offset.
For this kind of jobs, I wrote elap.bash (V2), that be sourced by the following syntax:
or
(See comments for the full syntax)
So you could simply add this line at top of your script:
A little sample:
Do render on my host:
Using
trap2
instead oftrap
as an argument to the source command:Will render two columns last command and total:
3. Using strace
Yes, strace could do the job:
But it could make a lot of stuff!
Using a more restricted command:
Will dump a lighter log:
Depending on what you're searching for, you may be more restrictive:
Reading them will be a little harder:
The original bash script is not so easy to follow in this...
4. Using
script
,scriptreplay
and timing fileFor debugging scripts, I often use:
As part of BSD Utils,
script
(andscriptreplay
) is a very old tool which can be used to profile bash, with a very small footprint.Little demo:
Will produce:
and generate two files:
File
script.log
contain all traces andscript.tim
is the timing file:You could see the total time execution with the first and last lines of the logfile and/or by summarizing times in the timing file:
In the timing file, the second value is the number of the next bytes in corresponding logfile. This let you have the ability of replaying the log file optionally with an acceleration factor:
or
or
Showing times and commands side-by-side is a little more complex too:
Further work around scriptReplay: How to manipulate timing and typescript files created by "script" command?
Tests and conclusion
To make tests, I've downloaded second sample at Bash complex Hello, World!. This script take approx 0.72 seconds to complete on my host.
I've add at the top of the script one of:
by
elap.bash
functionby
set -x
andPS4
by
set -x
and initial fork to long exec commandby
script
(andset +x
)Times
And compare execution times (on my host):
Outputs
elap.bash
functionset -x
andPS4
set -x
and initial fork to long exec command (and my secondpaste
sample script)strace
script
Conclusion
Well! If my pure Bash script is quicker than forking to date on each command, using this implies some operations on each command.
The way of dedicating an independent process for logging and storing is clearly more efficient.
strace is an interesting way, more detailed, but hard to read.
script
, withscriptreplay
and acceleration factor is very nice too, but not the same precision as this is based on a console exchange instead of process execution, but very light and efficient (not same goal, not same usage).Finally, I think that the more efficient, in readability and performances is
set + 1 fork
, The first of this answer, but in fine, depending on the specific case, I use sometimesstrace
and/orscript
too.Some more tricks
Which part of script are currently running?
As you are asking about what's currently happen in your machine (backup server or else), you could use
ps -C
command, but for showing currently elapsed time:or better:
This show the root cron process with his currents childs and all start times.
Which files or path are currently in progress
On linux, you could browse
/proc
pseudo filesystem, for showing file descriptors currently open by any process:Of course all this have to be done with root's rights (in a root session or by using
sudo
).它通常有助于跟踪系统调用
来自手册:
-c 计算每个系统调用的时间、调用和错误,并报告程序退出时的摘要。
-f 跟踪子进程...
这并不完全是您想要的,也不是面向行的分析器向您显示的内容,但它通常有助于找到热点。
It often helps to trace the system calls
From the manual:
-c Count time, calls, and errors for each system call and report a summary on program exit.
-f Trace child processes ...
This is not exactly what you want and what a line-oriented profiler would show to you but it usually helps to find hot spots.
您可以查看带有 DEBUG 条件的
trap
命令。有一种方法可以设置要与您的命令一起执行的命令。请参阅答案的注释。You may have a look at
trap
command with DEBUG condition. There is a way to set a command(s) to be executed along with your commands. See the notes to the answer.在脚本前面添加以下内容:
输出文件以毫秒为单位列出命令:
Add this in the front of the script:
The output file lists commands in millisecond scale:
这是一个非常笨的方法
如果启动在秒范围内
而且这是一个缓慢的命令。
运行 bash -x 或 bash -lx (以较慢者为准)
然后尽可能快地继续 bash 输入,直到 bash 完成初始化。
然后向后滚动并观察后面空行最多的命令。
那是你缓慢的命令。
Here's a very dumb method
if the startup is in the seconds range
and it's a single command that is slow.
Run
bash -x
orbash -lx
(whichever is slow)and then keep bashing the enter as quickly as you can until bash finishes initialization.
Then scroll back and observe the command which has the most empty lines afterwards.
That's your slow command.
Alan Hargreaves 的这篇文章介绍了分析 Bourne 的方法使用 DTrace 提供程序的 shell 脚本。据我所知,这适用于 Solaris 和 OpenSolaris (请参阅:/bin/sh DTrace 提供程序)。
因此,给出以下 DTrace 脚本 (
sh_flowtime.d< /code>
在 GH 基于原文):
你可以追踪函数流程包括增量时间。
示例输出:
然后使用
sort -nrk7
命令,您可以对输出进行排序以显示最消耗的调用。我不知道任何提供程序探针可用于其他 shell,因此请进行一些研究(GitHub 搜索?),或者如果您想投入一些时间,您可以根据现有的 sh 示例编写此类探针(请参阅如何激活 sh DTrace 提供程序?):
This post by Alan Hargreaves describes the method of profiling Bourne shell script using a DTrace provider. As far as I know, this works with Solaris and OpenSolaris (see: /bin/sh DTrace Provider).
So given the following DTrace script (
sh_flowtime.d
at GH based on the original):you can trace the function flow including delta times.
Sample output:
Then using
sort -nrk7
command, you may sort the output to show most consuming calls.I'm not aware of any provider probes availably for other shells, so do some research (GitHub search?) or if you want to invest some time, you can write such based on the existing sh example (see How to activate sh DTrace Provider?):
时间、xtrace、bash -x、
set -x
和set +x
(2.3. 调试 Bash 脚本)仍然是调试脚本的正统方法。尽管如此,为了拓宽我们的视野或更精确地控制资源的使用,可以检查 Linux 下可用的一些调试和分析系统 (这里是来自互联网的众多列表之一):例如,Valgrind ,专门用于内存调试,或 sysprof 来分析整个系统系统:
对于 sysprof:
然后,选择您感兴趣的子进程分支。
对于 Valgrind:
有了更多的gym,似乎可以让我们通常从二进制安装的一些程序对 Valgrind 可见(例如 OpenOffice)。
可以从 Valgrind 将分析的Valgrind 常见问题解答中阅读如果明确请求,则子进程。
它将在启用此选项的情况下执行此操作:
其他参考:
Callgrind 或此处,仍用作来自 CERN wiki
Time, xtrace, bash -x,
set -x
andset +x
(2.3. Debugging Bash scripts) remain the orthodox way to debug a script.Nevertheless, to broaden our horizon or to have more precise control over the use of resources, it is possible to check some debugging and profiling systems available under Linux (here one of the many lists from internet): for example, Valgrind, specifically for memory debugging, or sysprof to profile the whole system:
For sysprof:
And after, to select the branch of sub-processes that you find interesting.
For Valgrind:
With some more gym, it seems to be possible to make visible to Valgrind some programs that we usually install from binary (e.g. OpenOffice).
It is possible to read from the FAQ of Valgrind that Valgrind will profile the child processes if explicitly requested.
It will do it with this option enabled:
Additional references:
Callgrind or also here, still in use as reported from a wiki of CERN