Non-critical unittest failures

Published 2024-08-03 23:34:51

I'm using Python's built-in unittest module and I want to write a few tests that are not critical.

I mean, if my program passes such tests, that's great! However, if it doesn't pass, it's not really a problem; the program will still work.

For example, my program is designed to work with a custom type "A". If it fails to work with "A", then it's broken. However, for convenience, most of it should also work with another type "B", but that's not mandatory. If it fails to work with "B", then it's not broken (because it still works with "A", which is its main purpose). Failing to work with "B" is not critical, I will just miss a "bonus feature" I could have.

Another (hypothetical) example is when writing an OCR. The algorithm should recognize most images in the tests, but it's okay if some of them fail. (And no, I'm not writing an OCR.)

Is there any way to write non-critical tests in unittest (or another testing framework)?


10 Answers

冬天旳寂寞 2024-08-10 23:34:51

As a practical matter, I'd probably use print statements to indicate failure in that case. A more correct solution is to use warnings:

http://docs.python.org/library/warnings.html

You could, however, use the logging facility to generate a more detailed record of your test results (i.e. set your "B" class failures to write warnings to the logs).

http://docs.python.org/library/logging.html
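
For illustration, a minimal sketch of that idea; the works_with_B helper is hypothetical:

import logging
import unittest
import warnings

def works_with_B():
    # Hypothetical check for the optional type-"B" support.
    return False  # pretend the bonus feature is currently broken

class TestBonusFeatures(unittest.TestCase):
    def test_type_b_support(self):
        # Warn and log instead of asserting, so this test never fails.
        if not works_with_B():
            warnings.warn("type 'B' support is broken (non-critical)")
            logging.getLogger(__name__).warning(
                "type 'B' support is broken (non-critical)")

if __name__ == '__main__':
    unittest.main()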

Edit:

The way we handle this in Django is that we have some tests we expect to fail, and we have others that we skip based on the environment. Since we can generally predict whether a test SHOULD fail or pass (i.e. if we can't import a certain module, the system doesn't have it, and so the test won't work), we can skip failing tests intelligently. This means that we still run every test that will pass, and have no tests that "might" pass. Unit tests are most useful when they do things predictably, and being able to detect whether or not a test SHOULD pass before we run it makes this possible.
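
A minimal sketch of that environment-based skipping, using the skip support newer unittest versions provide (the PIL dependency here is just a hypothetical example):

import unittest

# If an optional dependency is missing, we know the test cannot pass,
# so skip it intelligently instead of letting it fail.
try:
    import PIL  # hypothetical optional dependency
    HAS_PIL = True
except ImportError:
    HAS_PIL = False

class TestOptionalFeatures(unittest.TestCase):
    @unittest.skipUnless(HAS_PIL, "PIL is not installed")
    def test_image_support(self):
        self.assertTrue(HAS_PIL)

if __name__ == '__main__':
    unittest.main()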

初雪 2024-08-10 23:34:51

Asserts in unit tests are binary: they either pass or they fail; there's no middle ground.

Given that, to create those "non-critical" tests you should not use assertions when you don't want the tests to fail. You should do this carefully so you don't compromise the "usefulness" of the test.

My advice for your OCR example is to record the success rate in your test code and then make a single assertion such as "assert success_rate > 0.85"; that should give the effect you want.
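
For the OCR case, a runnable sketch of that aggregate assertion; the recognize function and the fixture data are hypothetical stand-ins:

import unittest

def recognize(image):
    # Hypothetical OCR routine; stubbed here to echo its input.
    return image

class TestOCRAccuracy(unittest.TestCase):
    def test_success_rate(self):
        # Hypothetical fixture: pairs of (input image, expected text).
        cases = [("img%d" % i, "img%d" % i) for i in range(1000)]
        hits = sum(1 for image, expected in cases
                   if recognize(image) == expected)
        success_rate = hits / float(len(cases))
        # A single binary assertion over the aggregate rate.
        self.assertGreater(success_rate, 0.85)

if __name__ == '__main__':
    unittest.main()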

江挽川 2024-08-10 23:34:51

Thank you for the great answers. No single answer was really complete, so here is a combination of all the answers that helped me. If you like this answer, please vote up the people responsible for it.

Conclusions

Unit tests (or at least unit tests in the unittest module) are binary. As Guilherme Chapiewski says: they either pass or they fail; there's no middle ground.

Thus, my conclusion is that unit tests are not exactly the right tool for this job. It seems that unit tests are more concerned with "keep everything working, no failure is expected", and thus I can't (or at least can't easily) have non-binary tests.

So, unit tests don't seem to be the right tool when I'm trying to improve an algorithm or an implementation, because they can't tell me how much better one version is than another (if both are correctly implemented, both will pass all the unit tests).

My final solution

My final solution is based on ryber's idea and the code shown in wcoenen's answer. I'm basically extending the default TextTestRunner and making it less verbose. My main code then runs two test suites: the critical one with the standard TextTestRunner, and the non-critical one with my own less verbose version.

import sys
import unittest

class _TerseTextTestResult(unittest._TextTestResult):
    # Note: _TextTestResult is the private pre-2.7 name; Python 2.7+
    # also exposes it publicly as unittest.TextTestResult.
    def printErrorList(self, flavour, errors):
        # Print one summary line per failed test, omitting the
        # separators and tracebacks that the default runner prints.
        for test, err in errors:
            self.stream.writeln("%s: %s" % (flavour, self.getDescription(test)))


class TerseTextTestRunner(unittest.TextTestRunner):
    def _makeResult(self):
        return _TerseTextTestResult(self.stream, self.descriptions, self.verbosity)


if __name__ == '__main__':
    sys.stderr.write("Running non-critical tests:\n")
    non_critical_suite = unittest.TestLoader().loadTestsFromTestCase(TestSomethingNonCritical)
    TerseTextTestRunner(verbosity=1).run(non_critical_suite)

    sys.stderr.write("\n")

    sys.stderr.write("Running CRITICAL tests:\n")
    suite = unittest.TestLoader().loadTestsFromTestCase(TestEverythingImportant)
    unittest.TextTestRunner(verbosity=1).run(suite)

Possible improvements

It would still be useful to know whether there is any testing framework with non-binary tests, as Kathy Van Stone suggested. I probably won't use it for this simple personal project, but it might be useful in future projects.

╭⌒浅淡时光〆 2024-08-10 23:34:51

I'm not totally sure how unittest works, but most unit-testing frameworks have something akin to categories. I suppose you could categorize such tests, mark them to be ignored, and then run them only when you're interested in them. But I know from experience that ignored tests very quickly become just that: ignored tests that nobody ever runs, making them a waste of the time and energy it took to write them.

My advice is for your app to do, or do not, there is no try.

以为你会在 2024-08-10 23:34:51

From the unittest documentation you linked:

"Instead of unittest.main(), there are other ways to run the tests with a finer level of control, less terse output, and no requirement to be run from the command line. For example, the last two lines may be replaced with:"

suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions)
unittest.TextTestRunner(verbosity=2).run(suite)

In your case, you can create separate TestSuite instances for the critical and non-critical tests, and control which suite is passed to the test runner with a command-line argument. Test suites can also contain other test suites, so you can build big hierarchies if you want.
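
A sketch of that suite-splitting approach; the test-case classes and the --all flag are hypothetical:

import sys
import unittest

class TestEverythingImportant(unittest.TestCase):
    def test_works_with_A(self):
        self.assertTrue(True)  # placeholder critical test

class TestSomethingNonCritical(unittest.TestCase):
    def test_works_with_B(self):
        self.assertTrue(True)  # placeholder non-critical test

if __name__ == '__main__':
    loader = unittest.TestLoader()
    critical = loader.loadTestsFromTestCase(TestEverythingImportant)
    non_critical = loader.loadTestsFromTestCase(TestSomethingNonCritical)

    # Suites can nest; include the non-critical suite only on demand.
    suites = [critical]
    if '--all' in sys.argv:
        suites.append(non_critical)
    unittest.TextTestRunner(verbosity=2).run(unittest.TestSuite(suites))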

绿光 2024-08-10 23:34:51

Python 2.7 (and 3.1) added support for skipping some test methods or test cases, as well as marking some tests as expected failure.

http://docs.python.org/library/unittest.html#skipping-tests-and-expected-failures

Tests marked as expected failures won't be counted as failures in the TestResult.
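
A minimal sketch of those decorators (available in Python 2.7+ and 3.1+):

import unittest

class TestBonusFeatures(unittest.TestCase):
    @unittest.expectedFailure
    def test_works_with_B(self):
        # Allowed to fail: reported as an expected failure, not a failure.
        self.fail("type 'B' support is broken")

    @unittest.skip("bonus feature not implemented yet")
    def test_future_feature(self):
        pass

if __name__ == '__main__':
    unittest.main()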

哆兒滾 2024-08-10 23:34:51

There are some test systems that allow warnings rather than failures, but unittest is not one of them (I don't know offhand which ones do), unless you want to extend it (which is possible).

You can make the tests so that they log warnings rather than fail.

Another way to handle this is to separate out the tests and only run them to get the pass/fail reports and not have any build dependencies (this depends on your build setup).

绿萝 2024-08-10 23:34:51

Take a look at Nose : http://somethingaboutorange.com/mrl/projects/nose/0.11.1/

There are plenty of command line options for selecting tests to run, and you can keep your existing unittest tests.
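
For example, nose's bundled attrib plugin lets you tag tests and deselect them from the command line; a small sketch (the non_critical tag name is arbitrary):

from nose.plugins.attrib import attr

@attr('non_critical')
def test_works_with_B():
    assert True

# Run only the untagged (critical) tests:
#   nosetests -a '!non_critical'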

晒暮凉 2024-08-10 23:34:51

Another possibility is to create a "B" branch (you ARE using some sort of version control, right?) and have your unit tests for "B" in there. That way, you keep your release version's unit tests clean (Look, all dots!), but still have tests for B. If you're using a modern version control system like git or mercurial (I'm partial to mercurial), branching/cloning and merging are trivial operations, so that's what I'd recommend.

However, I think you're using tests for something they're not meant to do. The real question is "How important to you is it that 'B' works?" Because your test suite should only have tests in it that you care whether they pass or fail. Tests that, if they fail, it means the code is broken. That's why I suggested only testing "B" in the "B" branch, since that would be the branch where you are developing the "B" feature.

You could test using a logger or print commands, if you like. But if you don't care enough about it being broken to have it flagged in your unit tests, I'd seriously question whether you care enough to test it at all. Besides, that adds needless complexity (extra variables to set the debug level, multiple testing vectors that are completely independent of each other yet operate within the same space, causing potential collisions and errors, etc.). Unless you're developing a "Hello, World!" app, I suspect your problem set is complicated enough without adding extra, unnecessary complications.

转身以后 2024-08-10 23:34:51

You could write your tests so that they count a success rate. With OCR, you could throw 1000 images at the code and require that 95% of them are recognized successfully.

If your program must work with type A, then the test should fail when that fails. If it's not required to work with B, what is the value of such a test?
