The Incredible Rate of Diminishing Returns of Fixing Software Bugs
(Stefan Priebsh: OOP and Design Patterns: Codeworks DC in September 2009)
This is a well-known result in empirical software engineering that has been replicated and verified over and over again in countless studies. Which is very rare in software engineering, unfortunately: most software engineering "results" are basically hearsay, anecdotes, guesses, opinions, wishful thinking or just plain lies. In fact, most software engineering probably doesn't deserve the "engineering" brand.
Unfortunately, despite being one of the most solid, most scientifically and statistically sound, most heavily researched, most widely verified, most often replicated results of software engineering, it is also wrong.
The problem is that none of those studies control their variables properly. If you want to measure the effect of a variable, you have to be very careful to change only that one variable and to make sure the other variables don't change at all. Not "change a few variables", not "minimize changes to other variables". "Only one" and the others "not at all".
Or, in the brilliant Zed Shaw's words: "If you want to measure something, then don't measure other shit".
In this particular case, they did not just measure in which phase (requirements, analysis, architecture, design, implementation, testing, maintenance) the bug was found, they also measured how long it stayed in the system. And it turns out that the phase is pretty much irrelevant, all that matters is the time. It's important that bugs be found fast, not in which phase.
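To see why controlling for time-to-discovery changes the picture, here is a minimal sketch with invented numbers (nothing here is data from the studies in question; the field names and figures are made up purely for illustration): grouping cost-to-fix only by phase entangles the phase with how long the bug sat in the system, while grouping by both separates them.

```python
from collections import defaultdict
from statistics import mean

# Each record: (phase_found, weeks_in_system, cost_to_fix_hours).
# Invented numbers chosen only to illustrate the confound, not taken from any study.
bugs = [
    ("requirements", 1, 2), ("design", 1, 2), ("testing", 1, 3),
    ("requirements", 10, 15), ("design", 12, 18), ("testing", 11, 16),
    ("testing", 30, 40), ("maintenance", 32, 42),
]

by_phase = defaultdict(list)
by_phase_and_latency = defaultdict(list)
for phase, weeks, cost in bugs:
    by_phase[phase].append(cost)
    bucket = "short" if weeks < 5 else "medium" if weeks < 20 else "long"
    by_phase_and_latency[(phase, bucket)].append(cost)

# Naive view: grouping by phase alone makes later phases look far more expensive.
for phase, costs in by_phase.items():
    print(f"{phase:12s} mean cost: {mean(costs):5.1f}")

# Controlled view: within the same time-in-system bucket, the phases look
# roughly the same; the latency, not the phase, is driving the cost.
for (phase, bucket), costs in sorted(by_phase_and_latency.items()):
    print(f"{phase:12s} / {bucket:6s} mean cost: {mean(costs):5.1f}")
```

In the toy data, the naive grouping reproduces the familiar "later phases are more expensive" curve, while the controlled grouping shows roughly flat costs within each latency bucket, which is exactly the pattern described above.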
This has some interesting ramifications: if it is important to find bugs fast, then why wait so long for the phase that is most likely to find them: testing? Why not put testing at the beginning?
The problem with the "traditional" interpretation is that it leads to inefficient decisions. Because you assume you need to find all bugs during the requirements phase, you drag out the requirements phase unnecessarily long: you can't run requirements (or architectures, or designs), so finding a bug in something that you cannot even execute is freaking hard! Basically, while fixing bugs in the requirements phase is cheap, finding them is expensive.
If, however, you realize that it's not about finding the bugs in the earliest possible phase, but rather about finding the bugs at the earliest possible time, then you can make adjustments to your process, so that you move the phase in which finding bugs is cheapest (testing) to the point in time where fixing them is cheapest (the very beginning).
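One common way to act on this (not something the studies or the talk prescribe, just an illustration) is test-first development: write the test before the code it exercises, so a defect is caught minutes after it is introduced rather than phases later. A minimal sketch, with a made-up parse_price function as the thing under test:

```python
import unittest


def parse_price(text):
    """Parse a price string like '$1,299.50' into a float (illustrative example only)."""
    return float(text.replace("$", "").replace(",", ""))


class ParsePriceTest(unittest.TestCase):
    # These tests are written first, while the requirement is fresh,
    # and run on every change, so the "testing phase" starts on day one.
    def test_plain_number(self):
        self.assertEqual(parse_price("42"), 42.0)

    def test_currency_symbol_and_thousands_separator(self):
        self.assertEqual(parse_price("$1,299.50"), 1299.50)


if __name__ == "__main__":
    unittest.main()
```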
Note: I am well aware of the irony of ending a rant about not properly applying statistics with a completely unsubstantiated claim. Unfortunately, I lost the link where I read this. Glenn Vanderburg also mentioned this in his "Real Software Engineering" talk at the Lone Star Ruby Conference 2010, but AFAICR, he didn't cite any sources, either.
If anybody knows any sources, please let me know or edit my answer, or even just steal my answer. (If you can find a source, you deserve all the rep!)
See pages 42 and 43 of this presentation (pdf).
Unfortunately the situation is as Jörg depicts, and in fact somewhat worse: most of the references cited in that document strike me as bogus, in the sense that the paper cited either is not original research, or does not contain words supporting the claim being made, or, in the case of the 1998 paper about Hughes (p. 54), contains measurements that in fact contradict what is implied by the curve on p. 42 of the presentation: a differently shaped curve, and a modest 5x to 10x factor in cost-to-fix between the requirements phase and the functional test phase, one which actually decreases in system test and maintenance.
Never heard of it being called a pyramid before, and that seems a bit upside-down to me! Still, the central thesis is widely considered to be correct. Just think about it: the cost of fixing a bug in the alpha stage is often trivial. By the beta stage it might take a bit more debugging and a few user reports. After shipping it can get very expensive: a whole new version has to be created, you have to worry about breaking in-production code and data, and there may also be lost sales due to the bug.
Try this article. It uses the "cost pyramid" argument (without naming it), among others.