Measuring code reusability
I am trying to measure how much of the code produced in our organisation is actually reusable, and I'd like to set some guidelines.
I'd like some reference points from outside the organisation:
How much code is typically reused in a single application?
More specifically: if we consider all the code of a complete end-user product (possibly excluding 3rd-party libraries), how many functions and methods are called from more than one place?
What metrics are used to measure code reusability?
Are there any available numbers or studies for open-source and/or closed-source software?
Comments (3)
IMO there is no "typical" application, especially not in this respect. Apps have wildly different architectures and execution flows, which result in different patterns of "reuse".
Consider a batch data processing app, which reads data from a file in a specific format, converts it to another format, then saves it. It practically has a single execution path, so not many methods are called from more than one place.
OTOH consider a plugin framework with several independent plugins which all use the same infrastructure layer, so functions in that layer are called from many different places.
You can't really say the design of the first app is worse than that of the second app (without actually going into the case-specific details).
Note also that the metric in the 2nd case is tricky: if you measure only the core framework itself without plugins, you get a low reuse count, but with the actual plugins the reuse count is higher. Since the plugins may be externally developed, you may not even have access to these, so your metric will be skewed.
Which leads to another point: reuse can happen on many levels. You can reuse code within an app, or between apps. The latter can be measured only by taking into account all the apps in question.
I think a better approach to this might be to start from the other end, and search for duplicated code (e.g. using tools like PMD for Java code). If you have large chunks of duplicated code in lots of places, you need to refactor.
If you define the question as "how many functions are called from more than one place", you can technically build a static analyzer to answer it; it's just a matter of building a call graph and doing some counting (see this for a tool that extracts call graphs from C, Java and COBOL). As a practical matter, you might find that it takes a lot more work than you are willing to do to answer this question directly.
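To make the "build a call graph and count" idea concrete, here is a minimal sketch for Python source using the standard `ast` module. It is not the tool referred to above. The function name `call_site_counts` and the `SAMPLE` snippet are made up for illustration, and the sketch only resolves calls made by plain name inside a single file (no methods, imports, or dynamic dispatch), so a real analyzer would need considerably more.

```python
import ast
from collections import defaultdict

def call_site_counts(source: str) -> dict[str, int]:
    """For each function defined in `source`, count how many distinct
    scopes (other functions, or module level) call it by plain name."""
    tree = ast.parse(source)
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, func_types)}
    callers = defaultdict(set)  # callee name -> set of calling scopes

    def visit(node, scope):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, func_types):
                # Everything inside this definition is attributed to it.
                visit(child, child.name)
                continue
            if (isinstance(child, ast.Call)
                    and isinstance(child.func, ast.Name)
                    and child.func.id in defined):
                callers[child.func.id].add(scope)
            visit(child, scope)

    visit(tree, "<module>")
    return {name: len(callers[name]) for name in defined}

# Hypothetical input, only to show the counting.
SAMPLE = '''
def parse(line):
    return line.split(",")

def load(path):
    return [parse(l) for l in open(path)]

def validate(line):
    return len(parse(line)) == 3

def main():
    load("data.csv")
'''

counts = call_site_counts(SAMPLE)
reused = sorted(name for name, c in counts.items() if c > 1)
print(reused, len(reused) / len(counts))
```

In this sample only `parse` is called from more than one place, so one function out of four counts as "reused" under this definition.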
You might consider running a clone detector across your code base. This will show code that has actually been reused (and should have been abstracted out somehow) by people doing copy and paste and provide precise metrics. Copying code like this is the most direct and common form of reuse.
I've been building clone detectors for about a decade. Almost every system I encounter, no matter what the language, has 20% of its code involved in clones (so some 10% is reused). I've seen examples of up to 55%.
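As a rough illustration of the "share of code involved in clones" metric, the sketch below flags every run of `window` consecutive lines that occurs more than once and reports the fraction of lines covered. This is a deliberately naive, exact-text approach and an assumption on my part: production clone detectors typically compare tokens or syntax trees and tolerate renamed identifiers. The name `clone_coverage` and the sample files are hypothetical.

```python
from collections import defaultdict

def clone_coverage(files: dict[str, str], window: int = 3) -> float:
    """Fraction of non-blank lines that lie inside a block of `window`
    consecutive (whitespace-stripped) lines occurring more than once."""
    seen = defaultdict(list)  # window contents -> [(file name, start index)]
    normalized = {}
    for name, text in files.items():
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        normalized[name] = lines
        for i in range(len(lines) - window + 1):
            seen[tuple(lines[i:i + window])].append((name, i))

    cloned = set()  # (file name, line index) pairs covered by some clone
    for places in seen.values():
        if len(places) > 1:
            for name, start in places:
                cloned.update((name, start + k) for k in range(window))

    total = sum(len(lines) for lines in normalized.values())
    return len(cloned) / total if total else 0.0

# Hypothetical files: three lines of a.py were pasted into b.py.
FILES = {
    "a.py": "x = read()\ny = x * 2\nprint(y)\nsave(y)\n",
    "b.py": "log()\nx = read()\ny = x * 2\nprint(y)\n",
}
print(clone_coverage(FILES))
```

Here 6 of the 8 lines are part of a duplicated three-line block, so the coverage is 0.75. On a real code base you would want a larger window and token-level normalization to avoid counting trivial matches.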
If you are working on the .NET platform, consider using NDepend to give you many metrics about your software. "Code reuse" is not available as a metric directly (probably for reasons that other posters already mentioned), but things like coupling and cohesion might be of interest to you, too.
Even if you're not on .NET, the metric definitions may be helpful.
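For a feel of what a coupling metric looks like, here is a small sketch of afferent coupling (Ca, how many modules depend on this one), efferent coupling (Ce, how many modules this one depends on) and instability, I = Ce / (Ca + Ce), as these are commonly defined in the literature. Whether this matches NDepend's exact definitions is an assumption; the `coupling` function and the `DEPS` map are invented for illustration. A module with high Ca is, in effect, one that many others reuse.

```python
def coupling(deps: dict[str, set[str]]) -> dict[str, tuple[int, int, float]]:
    """Return {module: (Ca, Ce, instability)} for a dependency map
    of the form {module: set of modules it depends on}."""
    modules = set(deps) | {d for targets in deps.values() for d in targets}
    result = {}
    for m in modules:
        # Efferent: outgoing dependencies, ignoring self-references.
        ce = len(deps.get(m, set()) - {m})
        # Afferent: other modules that list m as a dependency.
        ca = sum(1 for other, targets in deps.items()
                 if other != m and m in targets)
        instability = ce / (ca + ce) if (ca + ce) else 0.0
        result[m] = (ca, ce, instability)
    return result

# Hypothetical module graph.
DEPS = {
    "ui": {"core", "util"},
    "report": {"core", "util"},
    "core": {"util"},
    "util": set(),
}
for name, (ca, ce, inst) in sorted(coupling(DEPS).items()):
    print(f"{name}: Ca={ca} Ce={ce} I={inst:.2f}")
```

In this graph `util` has Ca = 3 and Ce = 0, which marks it as the most widely depended-on module, while `ui` and `report` only consume others.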