Using Python code coverage tools to understand and prune the source code of a large library
My project targets a low-cost, low-resource embedded device. It depends on a relatively large and sprawling Python code base, and my use of its APIs is quite specific.
I am keen to prune the code of this library back to its bare minimum, by executing my test suite under a coverage tool like Ned Batchelder's coverage or figleaf, then scripting the removal of unused code within the various modules/files. This will help not only with understanding the library's internals, but also make writing any patches easier. Ned actually refers to the use of coverage tools to "reverse engineer" complex code in one of his online talks.
My question to the SO community is whether people have experience of using coverage tools in this way that they wouldn't mind sharing. What are the pitfalls, if any? Is coverage a good choice, or would I be better off investing my time in figleaf?
The end-game is to be able to automatically generate a new source tree for the library, based on the original tree, but only including the code actually used when I run nosetests.
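For concreteness, the first step might look something like this: a minimal sketch, assuming coverage.py 5.x or later (the API differed in older versions) and a .coverage data file already produced by running the suite under coverage run; the paths involved are whatever the test run measured:

    # List executed line counts per measured file from a .coverage data file.
    from coverage import CoverageData

    data = CoverageData()   # defaults to the ".coverage" file in the cwd
    data.read()

    for path in sorted(data.measured_files()):
        executed = data.lines(path) or []   # line numbers seen at runtime
        print(f"{path}: {len(executed)} executed lines")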
If anyone has developed a tool that does a similar job for their Python applications and libraries, it would be terrific to get a baseline from which to start development.
Hopefully my description makes sense to readers...
3 Answers
What you want isn't "test coverage"; it is the transitive closure of "can call" from the root of the computation. (In threaded applications, you have to include "can fork".)
You want to designate some small set (perhaps only 1) of functions that make up the entry points of your application, and want to trace through all possible callees (conditional or unconditional) of that small set. This is the set of functions you must have.
Python makes this very hard in general (IIRC, I'm not a deep Python expert) because of dynamic dispatch, and especially because of "eval". Reasoning about which functions can get called can be pretty tricky for a static analyzer applied to a highly dynamic language.
One might use test coverage as a way to seed the "can call" relation with specific "did call" facts; that could catch a lot of dynamic dispatches (dependent on your test suite coverage). Then the result you want is the transitive closure of "can or did" call. This can still be erroneous, but is likely to be less so.
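To make that concrete, here is a toy sketch of the closure computation. The call-graph structure and all function names are hypothetical; in practice the static edges would come from an analyzer and the dynamic edges from coverage/tracing data:

    def reachable(call_graph, roots):
        """Transitive closure of "can or did call" from a set of roots.
        call_graph maps a function name to the set of names it may call."""
        keep, stack = set(roots), list(roots)
        while stack:
            fn = stack.pop()
            for callee in call_graph.get(fn, ()):
                if callee not in keep:
                    keep.add(callee)
                    stack.append(callee)
        return keep

    # Merge static "can call" edges with dynamic "did call" facts
    # observed under the test suite, then compute what must be kept.
    static_edges = {"main": {"parse", "run"}, "run": {"helper"}}
    dynamic_edges = {"run": {"plugin_hook"}}  # a dispatch only seen at runtime
    merged = {fn: static_edges.get(fn, set()) | dynamic_edges.get(fn, set())
              for fn in set(static_edges) | set(dynamic_edges)}
    print(reachable(merged, {"main"}))
    # keeps main, parse, run, helper, and plugin_hook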
Once you get a set of "necessary" functions, the next problem will be removing the unnecessary functions from the source files you have. If the number of files you start with is large, the manual effort to remove the dead stuff may be pretty high. Worse, you're likely to revise your application, and then the answer as to what to keep changes. So for every change (release), you need to reliably recompute this answer.
My company builds a tool that does this analysis for Java packages (with appropriate caveats regarding dynamic loads and reflection): the input is a set of Java files and (as above) a designated set of root functions. The tool computes the call graph, also finds all dead member variables, and produces two outputs: a) the list of purportedly dead methods and members, and b) a revised set of files with all the "dead" stuff removed. If you believe a), then you use b). If you think a) is wrong, then you add the elements listed in a) to the set of roots and repeat the analysis until you think a) is right. To do this, you need a static analysis tool that parses Java, computes the call graph, and then revises the code modules to remove the dead entries. The basic idea applies to any language.
You'd need a similar tool for Python, I'd expect.
Maybe you can stick to just dropping files that are completely unused, although that may still be a lot of work.
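As a sketch of that file-level pruning (all paths hypothetical; assumes coverage.py 5.x+ and a .coverage data file from a test run), one could copy only the files coverage actually measured into a fresh tree:

    import pathlib
    import shutil
    from coverage import CoverageData

    src = pathlib.Path("path/to/library").resolve()   # hypothetical tree
    dst = pathlib.Path("pruned_library")

    data = CoverageData()
    data.read()
    used = {pathlib.Path(f).resolve() for f in data.measured_files()}

    for py in src.rglob("*.py"):
        if py.resolve() in used:
            target = dst / py.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(py, target)

Anything such a copy misses (modules loaded by name, data files, packages imported only for side effects) would have to be whitelisted back in by hand.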
As others have pointed out, coverage can tell you what code has been executed. The trick for you is to be sure that your test suite truly exercises the code fully. The failure case here is over-pruning because your tests skipped some code that will really be needed in production.
Be sure to get the latest version of coverage.py (v3.4): it adds a new feature to indicate files that are never executed at all.
BTW: for a first-cut prune, Python provides a neat trick: remove all the .pyc files in your source tree, then run your tests. Files that still have no .pyc file were clearly not executed!
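A sketch of that trick in script form (the tree path is hypothetical). One caveat: on Python 3 the compiled files land in __pycache__ rather than next to the source, so the check has to look in both places:

    import pathlib

    tree = pathlib.Path("path/to/library")   # hypothetical source tree
    for py in tree.rglob("*.py"):
        # Python 2 left module.pyc beside the source; Python 3 writes
        # __pycache__/module.cpython-XY.pyc instead -- check both.
        beside = py.with_suffix(".pyc").exists()
        cached = any((py.parent / "__pycache__").glob(py.stem + ".*.pyc"))
        if not (beside or cached):
            print("never imported:", py)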
I haven't used coverage for pruning, but it seems like it should do well. I've used the combination of nosetests + coverage, and it worked better for me than figleaf. In particular, I found the html report from nosetests+coverage helpful -- it should help you see where the unused portions of the library are.
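For reference, the same kind of HTML report can also be produced directly from coverage.py; a minimal sketch, assuming coverage.py 5.x+ and an existing .coverage data file from a test run:

    import coverage

    cov = coverage.Coverage()
    cov.load()   # read the existing .coverage data file
    cov.html_report(directory="htmlcov")
    # Open htmlcov/index.html: unexecuted lines are highlighted per file.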