Is mutation testing useful in practice?

Posted on 2024-07-07 06:15:06

Do you have any examples of real life applications of mutation testing? Does it work better than simple test coverage tools? Or is it useless?

What are the advantages/disadvantages of mutation testing in the real world?

Comments (11)

埖埖迣鎅 2024-07-14 06:15:06

The usefulness of unit tests is no longer in question. They are essential to building a quality application. But how can we assess their relevance?
A code coverage figure of 100% doesn't mean the code is 100% tested. It only shows which code was executed while the unit tests ran.
Mutation testing gives you more confidence in your tests.

This is a two-step process:

  1. Generate mutants.
  2. Check that the mutations are found by the tests.
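
The two steps above can be sketched in a few lines of Python. This is only an illustration, not how any particular tool works: the function `price_after_discount`, the two operator swaps, and both test suites are invented for the example. Step 1 rewrites one operator at a time in the parsed source; step 2 runs a suite against each mutant and lists the ones that survive.

```python
import ast

# Hypothetical code under test, kept as source text so it can be mutated.
SOURCE = """
def price_after_discount(total):
    if total >= 100:
        return total - 10
    return total
"""


class MutateOnePoint(ast.NodeTransformer):
    """Apply one operator swap at the mutation point numbered `target`."""

    def __init__(self, target):
        self.target = target
        self.seen = 0  # mutation points encountered so far

    def _hit(self):
        hit = self.seen == self.target
        self.seen += 1
        return hit

    def visit_Compare(self, node):
        self.generic_visit(node)
        if isinstance(node.ops[0], ast.GtE) and self._hit():
            node.ops[0] = ast.Gt()  # >= becomes >
        return node

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Sub) and self._hit():
            node.op = ast.Add()  # - becomes +
        return node


def build(target):
    """Step 1: return (function, mutation point count); target=-1 changes nothing."""
    mutator = MutateOnePoint(target)
    tree = ast.fix_missing_locations(mutator.visit(ast.parse(SOURCE)))
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["price_after_discount"], mutator.seen


def weak_suite(f):
    return f(50) == 50 and f(200) == 190


def strong_suite(f):
    return weak_suite(f) and f(100) == 90  # adds the boundary case


def survivors(suite):
    """Step 2: list the mutants that the suite fails to detect."""
    original, points = build(-1)
    assert suite(original), "the suite must pass on the unmutated code"
    return [i for i in range(points) if suite(build(i)[0])]


print(survivors(weak_suite))    # [0]
print(survivors(strong_suite))  # []
```

Both suites execute every line of the function, yet only the one with the boundary case at 100 detects the `>=` to `>` mutant.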

I wrote an entire article about this process, including some concrete cases.

命比纸薄 2024-07-14 06:15:06

I looked at mutation testing some time ago as a method for checking the efficacy of my automated regression testing scripts. Basically, a number of these scripts had missing checkpoints, so while they were exercising the application under test correctly, they weren't verifying the results against the baseline data. I found that a far simpler method than changing the code was to write another application that introduces modifications to a copy of the baseline, and then re-run the tests against the modified baseline. In this scenario, any test that passed was either faulty or incomplete.

This is not genuine mutation testing, but a method that uses a similar paradigm to test the efficacy of test scripts. It is simple enough to implement, and IMO does a good job.
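
A rough sketch of that idea in Python, under invented assumptions: the baseline is a small dictionary, `run_application` stands in for the real application, and the two scripts are hypothetical. Every value in a copy of the baseline is changed, and any script that still passes is flagged.

```python
# Hypothetical baseline data, invented for illustration.
BASELINE = {"total": 42, "status": "ok"}


def run_application():
    """Stand-in for the application under test."""
    return {"total": 42, "status": "ok"}


def script_with_checkpoint(baseline):
    """Exercises the application and verifies the result against the baseline."""
    return run_application() == baseline


def script_missing_checkpoint(baseline):
    """Exercises the application but never compares anything."""
    run_application()
    return True


def modified_copy(baseline):
    """Return a copy of the baseline in which every value has been changed."""
    return {key: f"{value}_MODIFIED" for key, value in baseline.items()}


def scripts_passing_on_modified_baseline(scripts, baseline):
    """Any script that still passes here is faulty or incomplete."""
    bad = modified_copy(baseline)
    return [script.__name__ for script in scripts if script(bad)]


scripts = [script_with_checkpoint, script_missing_checkpoint]
assert all(script(BASELINE) for script in scripts)  # both pass on the real baseline
print(scripts_passing_on_modified_baseline(scripts, BASELINE))
# ['script_missing_checkpoint']
```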

醉生梦死 2024-07-14 06:15:06

I know this is an old question, but Uncle Bob recently wrote a very interesting blog post about mutation testing that can help in understanding the usefulness of this type of testing:

Uncle Bob's mutation testing blog post

把回忆走一遍 2024-07-14 06:15:06

I've played around with pitest for a small, contrived application:

http://pitest.org/

It's a Java tool that automates mutant generation. You can run it against your test suite and it'll generate HTML reports for you indicating how many mutants were killed. It seemed quite effective and didn't require much effort to set up. There are actually quite a few nice tools in the Java world for this sort of thing. For coverage, see also:

http://www.eclemma.org/

I think the concepts behind mutation testing are sound. It's just a matter of tool support and awareness. You're fighting a tradeoff between the simplicity of traditional code coverage metrics and additional complexity of this technique - it really just comes down to tools. If you can generate the mutants, then it will help expose weaknesses in your test cases. Is it worth the marginal increase in effort over the testing you already do? With pitest, I did find it turning up test cases that seemed non-obvious.

Mutation testing is an angle of attack that's quite different from the unit/functional/integration testing methodologies.

  1. You test your test suite - it's a meta-test of your whole testing program.
  2. It inspires additional test cases you might not have otherwise considered.

空心↖ 2024-07-14 06:15:06

Mutation testing has helped me identify problems with test case assertions.

For example, when you get a report that says "no mutant has been killed by test case x", you take a look, and it turns out the assertion had been commented out.

According to this paper, developers at Google use mutation testing as a complement to code review and pull-request inspections. They seem happy with the results:

Developers have decided to redesign large chunks of code to make them testable just so a mutant could be killed, they have found bugs in complex logical expressions looking at mutants, they have decided to remove code with an equivalent mutant because they deemed it a premature optimization, they’ve claimed the mutant saved them hours of debugging and even production outages because no test cases were covering the logic under mutation properly. Mutation testing has been called one of the best improvements in the code review verification in years. While this feedback is hardly quantifiable, combined with the sheer number of thousands of developers willing to inspect surfaced mutants on their code changes makes a statement.

梦归所梦 2024-07-14 06:15:06

I recently did some investigations on mutation testing. Results are here:

http://abeletsky.blogspot.com/2010/07/using-of-mutation-testing-in-real.html

In short: mutation testing can give some information about the quality of source code and tests, but it is not straightforward to use.

冷了相思 2024-07-14 06:15:06

Mutation testing has helped me in two particular kinds of projects:

  • Small library developed by myself: I used mutation testing to test the quality of my tests. I discovered that even doing "strict TDD", I had surviving mutants. It helped me understand some anti-patterns in my testing style. I even included mutation testing analysis as part of CI (only when merging to the main branch). But I could do that because the library was tiny and had zero dependencies. The code was simple and fast and all tests were unit tests (about 300 in total).

  • Microservice written by a junior team: I was the tech lead on that project, and I suspected the quality of the solution was not good, and the mutation analysis confirmed that hypothesis. The team had little experience writing tests and they missed a lot of cases. I was able to convince managers and developers about the quality of our work by showing the reports and where exactly the mutations were in our project.

In those projects, I've used Stryker (for JS and TS) and I was happy with the results. It helped me show how mutation testing works to people that didn't know about it.

As generating tons of mutations is a pretty CPU-intensive task, it's not something you can do all the time (like running the tests to get immediate feedback), but you can do it after finishing your feature/bugfix/change as a last-minute check before submitting the code. Or if you are in a refactoring sprint/phase, it's a good time to run the tool as well.

It was not helpful in a large Rails application that had slow, coupled tests. Basically, every attempt to run a mutation testing tool either crashed or returned a huge amount of data that was hard to process. In that case, I did mutation testing manually (generating the mutations by hand) on the critical parts of the code. But this approach is heavily influenced by your own biases (how do you choose a "good" mutant?).

Compared to test coverage, I tend to say that test coverage is a quantity metric (it says how much code is hit by the tests), and mutation score is a quality metric (it says how likely your tests are to catch bugs introduced by changes to the code).

笑忘罢 2024-07-14 06:15:06

Coverage vs. mutation testing. An old question, but I recently came across a blog post on the topic. Pretty opinionated, but the differences between coverage and mutation testing are clearly articulated.

https://pedrorijo.com/blog/intro-mutation/

My own experience shows that Pitest is pretty useful, but since the runtime explodes, it works only on very fast test sets. In practice this limits where I apply mutation testing.

2024-07-14 06:15:06

Because of the mutation above, the first test case behaves differently: an exception is now raised, so it doesn't return the expected array {6,3}. Our second test case fares the same, because it also includes a positive number: it raises the exception as well. Now, if we have to write a test case that passes, it would be:
Input = {-6,-6,-7,-3,-4}
Expected = {-6,-3}

谜兔 2024-07-14 06:15:06

I set up mutation testing on Angular using https://stryker-mutator.io/docs/stryker-js/guides/angular/ simply to experiment and it took 2 hours to get a report for a single code file. That said, I was very happy with the experience of using Stryker with .NET. I must admit I am fairly new to mutation testing and there might be better tools that work with Angular/karma but performance is something to keep in mind especially if you plan to use it in conjunction with TDD.

蓝海似她心 2024-07-14 06:15:06

If you accept that

a) Unit tests are necessary

b) Measuring efficacy of unit tests is needed. (This is what code coverage tries to do.)

c) That code coverage alone is a limited metric

d) That a unit test should itself be tested, i.e. be shown to fail in appropriate scenarios (note: this is what TDD Red-Green-Refactor tries to achieve)

Then you must accept that mutation testing is necessary.
