What is the fascination with code metrics?
I've seen a number of 'code metrics' related questions on SO lately, and have to wonder what the fascination is? Here are some recent examples:
- what code metrics convince you that provided code is crappy
- when if ever is number of lines of code a useful metric
- writing quality tests
In my mind, no metric can substitute for a code review, though:
- some metrics sometimes may indicate places that need to be reviewed, and
- radical changes in metrics over short time frames may indicate places that need to be reviewed
But I cannot think of a single metric that by itself always indicates 'good' or 'bad' code - there are always exceptions and reasons for things that the measurements cannot see.
Is there some magical insight to be gained from code metrics that I've overlooked? Are lazy programmers/managers looking for excuses not to read code? Are people presented with giant legacy code bases and looking for a place to start? What's going on?
Note: I have asked some of these questions on the specific threads, in both answers and comments, and got no replies, so I thought I should ask the community in general, as perhaps I am missing something. It would be nice to run a metrics batch job and never actually have to read other people's code (or my own) again; I just don't think it is practical!
EDIT: I am familiar with most if not all of the metrics being discussed; I just don't see the point of them in isolation or as arbitrary standards of quality.
The answers in this thread are kind of odd as they speak of:
1/ Metrics are not for one population, but for three:
All those metrics can be watched and analyzed by all three populations of course, but each kind is designed to be better used by each specific group.
2/ Metrics, by themselves, represent a snapshot of the code, and that means... nothing!
It is the combination of those metrics, and the combination of those different levels of analysis, that may indicate "good" or "bad" code, but more importantly, it is the trend of those metrics that is significant.
It is the repetition of those metrics that gives the real added value, as it helps the business managers/project leaders/developers to prioritize among the different possible code fixes
In other words, your question about the "fascination of metrics" could refer to the difference between:
So, for instance, a function with a cyclomatic complexity of 9 could be defined as "beautiful", as opposed to one long, convoluted function with a cyclomatic complexity of 42.
BUT, if:
one could argue:
So, to summarize:
: not much, except that the code may be more "beautiful", which in itself does not mean a lot...
Only the combination and trend of metrics give the real "magical insight" you are after.
Some months ago, a project that I had done as a one-person job was measured for cyclomatic complexity. That was my first exposure to this kind of metric.
The first report I got was shocking. Almost all of my functions failed the test, even the (imho) very simple ones. For about half of them, I got around the complexity thing by moving logical sub-tasks into subroutines, even if they were called only once.
For the other half of the routines my pride as a programmer kicked in, and I tried to rewrite them so that they did the same thing, just in a simpler and more readable way. That worked, and I was able to get most of them down to the customer's cyclomatic complexity threshold.
In the end I was almost always able to come up with a better solution and much cleaner code. The performance did not suffer from this (trust me, I'm paranoid about this, and I check the disassembly of the compiler output quite often).
I think metrics are a good thing if you use them as a reason/motivation to improve your code. It's important to know when to stop and ask for a metric violation grant, though.
Metrics are guides and aids, not ends in themselves.
The best metric that I have ever used is the C.R.A.P. score.
Basically it's an algorithm that compares weighted cyclomatic complexity with automated test coverage. The algorithm looks like this:
CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests.
The authors of the aforementioned article (please, go read it...it's well worth your time) suggest a max C.R.A.P. score of 30, which breaks down in the following way:
As you quickly see, the metric rewards writing code that is not complex coupled with good test coverage (if you are writing unit tests, and you should be, and are not measuring coverage...well, you probably would enjoy spitting into the wind as well). ;-)
For most of my development teams I tried really hard to get the C.R.A.P. score below 8, but if they had valid reasons to justify the added complexity that was acceptable as long as they covered the complexity with sufficient tests. (Writing complex code is always very difficult to test...kind of a hidden benefit to this metric).
Most people initially found it hard to write code that would pass the C.R.A.P. score. But over time they wrote better code, code that had fewer problems, and code that was a lot easier to debug. Of all the metrics, this is the one that has the fewest concerns and the greatest benefit.
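To make the formula above concrete, here is a minimal Python sketch of it. The function name `crap_score`, the constant `MAX_CRAP`, and the input validation are my own additions for illustration; only the formula and the suggested ceiling of 30 come from the answer.

```python
MAX_CRAP = 30  # ceiling suggested by the article's authors, per the answer above


def crap_score(complexity: int, coverage_percent: float) -> float:
    """CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m).

    complexity       -- cyclomatic complexity of the method (at least 1)
    coverage_percent -- automated test coverage of the method, 0 to 100
    """
    if complexity < 1:
        raise ValueError("cyclomatic complexity is at least 1")
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage must be a percentage between 0 and 100")
    uncovered = 1 - coverage_percent / 100
    return complexity ** 2 * uncovered ** 3 + complexity


if __name__ == "__main__":
    # Hypothetical (complexity, coverage) pairs, chosen only to show the shape.
    for comp, cov in [(5, 0), (5, 100), (10, 50), (10, 0)]:
        score = crap_score(comp, cov)
        verdict = "ok" if score <= MAX_CRAP else "over the limit"
        print(f"comp={comp:2d} cov={cov:3d}% -> CRAP={score:6.1f} ({verdict})")
```

With full coverage the score collapses to the complexity itself, and with no coverage it grows roughly with the square of the complexity, which is why an untested method of complexity 5 already sits exactly at the limit of 30.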
For me the single most important metric that identifies bad code is cyclomatic complexity. Almost all methods in my projects are below CC 10 and bugs are invariably found in legacy methods with CC over 30.
High CC usually indicates:
A good code review is no substitute for a good static analysis tool, which is of course no substitute for a good set of unit tests, and unit tests are no good without a set of acceptance tests......
Code metrics are another tool to put into your tool box. They are not a solution in their own right; they are just a tool to be used as appropriate (with, of course, all the other tools in your box!).
People are drawn to the idea of mechanistic ways to understand and describe code. If true, think of the ramifications for efficiency and productivity!
I agree that a metric for "code goodness" is about as sensible as a metric for "good prose." However, that doesn't mean metrics are useless, just perhaps misused.
For example, extreme values for some metrics point the way to possible problems. A 1000-line-long method is probably unmaintainable. Code with zero unit test coverage probably has more bugs than similar code with lots of tests. A big jump in code added to a project just before release, when that code isn't a third-party library, is probably cause for extra attention.
I think if we use metrics as a suggestion -- a red flag -- perhaps they can be useful. The problem is when people start measuring productivity in SLOC or quality in percentage of lines with tests.
My highly subjective opinion is that code metrics express the irresistible institutional fascination with being able to quantify something inherently unquantifiable.
It makes sense, in a way, at least psychologically: how can you make decisions about something you can't evaluate or understand? Ultimately, of course, you can't evaluate quality unless you're knowledgeable about the subject (and at least as good as what you're trying to evaluate) or you ask someone who is, which of course just puts the problem back one step.
In that sense, maybe a reasonable analogy would be evaluating college entrants by SAT scores: it's unfair and misses every kind of subtlety, but if you need to quantify, you've got to do something.
I'm not saying I think it's a good measure, only that I can see the institutional irresistibility of it. And, as you pointed out, there are probably a few reasonable metrics (lots of 500+ line methods or high complexity are probably bad). I've never been at a place that bought into this, though.
There's one code metric I believe in.
I'm working on a big system. When a single new requirement comes to me, I set about coding it up. When I'm done and got the bugs worked out, I check it into the version control system. That system does a diff, and counts up all the changes I made.
The smaller that number is, the better.
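As a rough sketch of that number, the snippet below counts how many lines differ between the old and new versions of a file using Python's standard `difflib`. The function name `changed_line_count` is hypothetical, and a real version control system may count changes differently (for example per hunk, or per file), so treat this only as an approximation of the idea.

```python
import difflib


def changed_line_count(before: str, after: str) -> int:
    """Count lines removed from `before` plus lines added in `after`."""
    old_lines = before.splitlines()
    new_lines = after.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A 'replace' counts both the old and the new lines,
            # 'delete' only old lines, 'insert' only new lines.
            changed += (i2 - i1) + (j2 - j1)
    return changed


if __name__ == "__main__":
    old = "a = 1\nb = 2\nprint(a + b)\n"
    new = "a = 1\nb = 3\nprint(a + b)\n"
    print(changed_line_count(old, new))
```

Summing this over every file touched by a change gives one number per requirement, which is the thing the answer wants to keep small.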
Metrics and automated tests aren't meant to be a replacement for full code reviews.
They just speed things up. With an automated checker, it's very easy to see which conventions you've forgotten to follow, that you're using the designated packages and methods, etc. You can see what you can fix without using other people's time.
Managers also like metrics because they feel they're getting an exact figure on productivity (though that's often not really the case) and that they should be able to juggle people better.
Measurements are only useful if:
In general, any metric that doesn't fit into that will suffer from the team optimizing to it. You want to measure lines of code? By gosh, watch how many they can write! You want to measure code coverage? By golly, watch me cover that code!
I think metrics can be useful for identifying trends, and in fact, I've seen some useful ones, such as plotting when the build breaks, code churn (number of lines of code changing throughout the project) and other things. But if the team isn't coming up with them, or they don't agree or understand them, you are likely in a world of hurt.
Here are some Complexity Metrics from stan4j, an Eclipse class structure analysis tool.
I like this tool and the metrics. I treat the metrics as statistics, indicators, and warning messages. Sometimes a method or class is complex because its logic really is complicated; in that case the thing to do is keep an eye on it and review it carefully to see whether it needs refactoring, since such code is normally error-prone. I also use it as an analysis tool for learning source code, because I like to learn from complex to simple. It also includes some other metrics, such as Robert C. Martin Metrics, Chidamber & Kemerer Metrics, and Count Metrics.
But I like this one best:
Complexity Metrics
Cyclomatic Complexity Metrics
Cyclomatic Complexity (CC)
The cyclomatic complexity of a method is the number of decision points in the method's control flow graph incremented by one. Decision points occur at if/for/while statements, case/catch clauses and similar source code elements, where the control flow is not just linear. The number of (byte code) decision points introduced by a single (source code) statement may vary, depending e.g. on the complexity of boolean expressions. The higher the cyclomatic complexity value of a method is, the more test cases are required to test all the branches of the method's control flow graph.
Average Cyclomatic Complexity
Average value of the Cyclomatic Complexity metric over all methods of an application, library, package tree or package.
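The two definitions above (decision points plus one, and the average over all methods) can be sketched in a few lines. This is not how stan4j works: stan4j analyzes Java byte code, whereas the sketch below walks Python source with the standard `ast` module, and the set of node types I count as decision points is my own approximation. The names `cyclomatic_complexity`, `complexity_by_function`, and `average_complexity` are hypothetical.

```python
import ast

# Node types treated as decision points (an approximation, not stan4j's rules).
_DECISION_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                   ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Decision points in the function body, incremented by one."""
    decisions = 0
    for node in ast.walk(func):
        if isinstance(node, _DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra branches, one per operator.
            decisions += len(node.values) - 1
    return decisions + 1


def complexity_by_function(source: str) -> dict[str, int]:
    """Map each function name in `source` to its complexity.

    Nested functions are reported separately and also counted in their parent.
    """
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def average_complexity(scores: dict[str, int]) -> float:
    """Average complexity over all measured functions."""
    return sum(scores.values()) / len(scores)


SAMPLE = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"

def identity(x):
    return x
'''

if __name__ == "__main__":
    scores = complexity_by_function(SAMPLE)
    print(scores, average_complexity(scores))
```

In the sample, `classify` has two `if` statements, one `for` loop, and one `and`, so four decision points and a complexity of 5, while `identity` has none and scores 1. As the definition says, the count for a single statement varies with the complexity of its boolean expression.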
Fat Metrics
The Fat metric of an artifact is the number of edges in an appropriate dependency graph of the artifact. The dependency graph type depends on the metric variant and the chosen artifact:
Fat
The Fat metric of an application, library or package tree is the edge count of its subtree dependency graph. This graph contains all the artifact's children in the package tree hierarchy, thereby also including leaf packages. (To see the appropriate graph in the Composition View, the Structure Explorer's Flat Packages toggle has to be disabled. The Show Libraries toggle has to be enabled if the chosen artifact is a library, otherwise it has to be disabled.)
The Fat metric of a package is the edge count of its unit dependency graph. This graph contains all top level classes of the package.
The Fat metric of a class is the edge count of its member graph. This graph contains all fields, methods and member classes of the class. (This graph and the Fat value are only available if the code analysis was performed with Level of Detail Member, not Class.)
Fat for Library Dependencies (Fat - Libraries)
The Fat for Library Dependencies metric of an application is the edge count of its library dependency graph. This graph contains all libraries of the application. (To see the appropriate graph in the Composition View, the Structure Explorer's Show Libraries toggle has to be enabled.)
Fat for Flat Package Dependencies (Fat - Packages)
The Fat for Flat Package Dependencies metric of an application is the edge count of its flat package dependency graph. This graph contains all packages of the application. (To see the appropriate graph in the Composition View, the Structure Explorer's Flat Packages toggle has to be enabled and the Show Libraries toggle has to be disabled.)
The Fat for Flat Package Dependencies metric of a library is the edge count of its flat package dependency graph. This graph contains all packages of the library. (To see the appropriate graph in the Composition View, the Structure Explorer's Flat Packages and Show Libraries toggles have to be enabled.)
Fat for Top Level Class Dependencies (Fat - Units)
The Fat for Top Level Class Dependencies metric of an application or library is the edge count of its unit dependency graph. This graph contains all the top level classes of the application or library. (For reasonable applications it is too large to be visualized and thus can not be displayed in the Composition View. Unit dependency graphs may only be displayed for packages.)
Metrics may be useful to determine the improvement or degradation in a project, and can certainly find style and convention violations, but there is no substitute for doing peer code reviews. You can't possibly know the quality of your code without them.
Oh ... and this assumes that at least one of the participants in your code review has a clue.
I agree with you that code metrics should not substitute for a code review, but I believe that they should complement code reviews. I think it gets back to the old saying that "you cannot improve what you cannot measure." Code metrics can provide the development team with quantifiable "code smells" or patterns that may need further investigation. The metrics captured in most static analysis tools are typically ones that research over our field's short history has identified as having significant meaning.
Metrics are not a substitute for code review, but they're far cheaper. They're an indicator more than anything.
One part of the answer is that some code metrics can give you a very quick, initial stab at an answer to the question: What is this code like?
Even 'lines of code' can give you an idea of the size of the code base you are looking at.
As mentioned in another answer, the trend of the metrics gives you the most information.
Metrics in themselves are not particularly interesting. It's what you do with them that counts.
For example, if you were measuring the number of comments per line of code, what would you consider a good value? Who knows? Or perhaps more importantly, everyone has their own opinion.
Now if you collect enough information to be able to correlate the number of comments per line of code against the time taken to resolve a bug, or against the number of bugs found that are attributed to coding, then you may start to find an empirically useful number.
There is no difference between using metrics in software and using any other performance measure on any other process - first you measure, then you analyse, then you improve the process. If all you're doing is measuring, you're wasting your time.
edit: In response to Steven A. Lowe's comments: that's absolutely correct. In any data analysis one must be careful to distinguish between a causal relationship and a mere correlation. And selecting metrics on the basis of suitability is important. There is no point in trying to measure coffee consumption and attribute code quality to it (although I'm sure some have tried ;-) )
But before you can find the relationship (causal or not) you have to have the data.
The selection of the data to collect is based on what process you wish to verify or improve. For example, if you're trying to analyse the success of your code review procedures (using your own definition of "success", be that reduced bugs, reduced coding bugs, shorter turnaround time, or whatever), then you select metrics that measure the total rate of bugs and the rate of bugs in reviewed code.
So before you collect the data you have to know what you want to do with it. If metrics are the means, what is the end?
I don't think small changes in metrics are meaningful: a function with complexity 20 is not necessarily cleaner than a function with complexity 30. But it's worth running metrics to look for large differences.
One time I was surveying a couple dozen projects and one of the projects had a maximum complexity value around 6,000 while every other project had a value around 100 or less. That hit me over the head like a baseball bat. Obviously something unusual, and probably bad, was going on with that project.
We're programmers. We like numbers.
Also, what are you going to do, NOT describe the size of the codebase because "lines of code metrics are irrelevant"?
There is definitely a difference between a codebase of 150 lines and one of 150 million, to take a silly example. And it's not a hard number to get.