I've been handed a project that consists of several dozen (probably over 100, I haven't counted) bash scripts. Most of the scripts make at least one call to another one of the scripts. I'd like to get the equivalent of a call graph where the nodes are the scripts instead of functions.
Is there any existing software to do this?
If not, does anybody have clever ideas for how to do this?
Best plan I could come up with was to enumerate the scripts and check to see if the basenames are unique (they span multiple directories). If there are duplicate basenames, then cry, because the script paths are usually held in variable names so you may not be able to disambiguate. If they are unique, then grep the names in the scripts and use those results to build up a graph. Use some tool (suggestions?) to visualize the graph.
Suggestions?
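The uniqueness check in the plan above can be sketched roughly as follows. This is only an illustration: the function name `duplicate_basenames` is made up here, and it assumes no file name contains a newline.

```shell
# Print every basename that occurs more than once among the regular
# files under the directory given as $1 (default: current directory).
# Hypothetical helper; assumes file names contain no newlines.
duplicate_basenames() {
    find "${1:-.}" -type f |
        sed 's|.*/||' |   # strip the directory part, keep the basename
        sort |
        uniq -d           # keep only names that appear at least twice
}
```

If `duplicate_basenames .` prints nothing, the basenames are unique and grepping for them is unambiguous.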
Wrap the shell itself with your own implementation: log who called your wrapper, then exec the original shell.
Yes, you have to run the scripts in order to identify which script is really called. Otherwise you would need a tool with the same knowledge as the shell itself, one that handles all of the variable expansion, PATH lookup, and so on. I have never heard of such a tool.
To visualize the call graph, use GraphViz's dot format.
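A minimal sketch of the two pieces, under stated assumptions: the log path, the `/bin/bash.real` location mentioned in the comment, and the function names `log_call` and `log_to_dot` are all hypothetical, and script names are assumed to contain no tabs or double quotes.

```shell
# Hypothetical log location; override by setting CALL_LOG.
CALL_LOG=${CALL_LOG:-/tmp/shell_calls.log}

# Append "<parent command line><TAB><script being run>" to the log.
# Falls back to "unknown" if ps cannot report the parent process.
log_call() {
    parent=$(ps -o args= -p "$PPID" 2>/dev/null) || parent=
    printf '%s\t%s\n' "${parent:-unknown}" "$1" >> "$CALL_LOG"
}

# A wrapper installed in place of the shell would then do something like
# (assuming the real shell was moved to /bin/bash.real):
#   log_call "$1"; exec /bin/bash.real "$@"

# Read tab-separated "caller<TAB>callee" lines on stdin and print a
# GraphViz digraph with one edge per distinct pair.
log_to_dot() {
    echo 'digraph calls {'
    sort -u | awk -F '\t' '{ printf "  \"%s\" -> \"%s\";\n", $1, $2 }'
    echo '}'
}
```

Something like `log_to_dot < "$CALL_LOG" > calls.dot` followed by `dot -Tsvg calls.dot -o calls.svg` would then render the graph.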
Here's how I wound up doing it (disclaimer: a lot of this is hack-ish, so you may want to clean up if you're going to use it long-term)...
Assumptions:
- Current directory contains all scripts/binaries in question.
- Files for building the graph go in subdir call_graph.
Created the script call_graph/make_tgf.sh:
Then, I ran the following (I wound up doing the scripts-only version):
I then opened the resulting tgf file in yEd, and had yEd do the layout (Layout -> Hierarchical). I saved as graphml to separate the manually-editable file from the automatically-generated one.
I found that there were certain nodes that were not helpful to have in the graph, such as utility scripts/binaries that were called all over the place. So, I removed these from the sources/targets files and regenerated as necessary until I liked the node set.
Hope this helps somebody...
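For illustration only, a script in the spirit of the approach described above might look like the sketch below. It is not the original make_tgf.sh: the function name `make_tgf` is made up, it treats every regular file in one directory as a node, and it counts any whole-word occurrence of another file's name as a call. The output uses the Trivial Graph Format layout of node lines, a `#` separator, then edge lines.

```shell
# Emit a TGF graph for the regular files in directory $1 (default: .).
# Hypothetical sketch: an edge "A B" means the name of file B appears
# as a whole word somewhere in file A.
make_tgf() {
    dir=${1:-.}
    list=$(mktemp) || return 1

    # Node section: "<numeric id> <file name>", one line per file.
    ( cd "$dir" && for f in *; do
          if [ -f "$f" ]; then printf '%s\n' "$f"; fi
      done ) | awk '{ print NR, $0 }' > "$list"
    cat "$list"

    # Separator between nodes and edges.
    echo '#'

    # Edge section: compare every file against every other file name.
    while read -r sid src; do
        while read -r did dst; do
            [ "$sid" = "$did" ] && continue
            if grep -qwF -- "$dst" "$dir/$src"; then
                printf '%s %s\n' "$sid" "$did"
            fi
        done < "$list"
    done < "$list"

    rm -f "$list"
}
```

Running `make_tgf . > call_graph/calls.tgf` would produce a file that can be opened in yEd. Dropping noisy utility scripts then amounts to filtering the node list before the edge loop.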
Insert a line at the beginning of each shell script, after the #! line, which logs a timestamp, the full pathname of the script, and the argument list.
Over time, you can mine this log to identify likely candidates, i.e. two lines logged very close together have a high probability of the first script calling the second.
This also allows you to focus on the scripts which are still actually in use.
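Mining the log could be sketched as below. The log format is an assumption: one line per script start, in time order, of the form `<epoch seconds> <script path> [args...]`, with no spaces in the path. The function name `likely_calls` is hypothetical.

```shell
# Read log lines "<epoch-seconds> <script-path> [args...]" on stdin and
# print "caller -> callee" for each pair of consecutive entries that are
# at most $1 seconds apart (default 1) and name different scripts.
likely_calls() {
    awk -v gap="${1:-1}" '
        NR > 1 && ($1 - prev_t) <= gap && $2 != prev_s {
            print prev_s " -> " $2
        }
        { prev_t = $1; prev_s = $2 }
    '
}
```

For example, `likely_calls 2 < /tmp/script_calls.log | sort | uniq -c | sort -rn` would rank candidate pairs by how often they were seen. These are only candidates: unrelated scripts started at the same moment produce false edges.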
You could use an ed script
and run it like so:
find / -type f -perm -u+x -exec sh -c 'ed -s "$1" <edscript' sh {} \;
Make sure you test the find command with -print instead of the exec clause. And / is probably not the path that you want to use. If you have to include bin directories then you will probably need to switch to grep in order to identify the pathnames to include, then when you have a file full of the right names, use xargs instead of find to run the script.
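As a sketch of what such an ed script might contain, wrapped in a shell function so it can be tried on one file at a time: the log path /tmp/script_calls.log, the log line format, and the function name `add_log_line` are assumptions, not taken from the answer. Note that `$0` is only a full pathname if the script was invoked by its full path.

```shell
# Insert a logging command after line 1 (the #! line) of the file in $1.
# The here-document is the ed script: "1a" appends after line 1, a lone
# "." ends the inserted text, "w" writes the file and "q" quits.
# Hypothetical log path: /tmp/script_calls.log
add_log_line() {
    ed -s "$1" <<'EOF'
1a
printf '%s %s %s\n' "$(date +%s)" "$0" "$*" >> /tmp/script_calls.log
.
w
q
EOF
}
```

It is worth trying this on a copy of a single script first, since it edits files in place and does not check whether the line was already added.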