How do I properly use TDD to implement a numerical method?
I am trying to use Test Driven Development to implement my signal processing library, but I have a question. Assume I am trying to implement a sine method (I'm not; it is just an example):
Write the test (pseudo-code)
assertEqual(0, sine(0))
Write the first implementation
function sine(radians) return 0
Second test
assertEqual(1, sine(pi/2))
At this point, should I:
- implement smart code that will work for pi/2 and other values, or
- implement the dumbest code that will work only for 0 and pi/2?
If the answer is the second option, when can I jump to the first option? I will have to do it eventually...
At this point, should I:
implement real code that will work outside the two simple tests?
implement more dumb code that will work only for the two simple tests?
Neither. I'm not sure where you got the "write just one test at a time" approach from, but it sure is a slow way to go.
The point is to write clear tests and use that clear testing to design your program.
So, write enough tests to actually validate a sine function. Two tests are clearly inadequate.
In the case of a continuous function, you have to provide a table of known good values eventually. Why wait?
However, testing continuous functions has some problems. You can't follow a dumb TDD procedure.
You can't test all floating-point values between 0 and 2*pi, and testing a few random values proves very little.
In the case of continuous functions, "strict, unthinking TDD" doesn't work. The issue here is that you know your sine function implementation will be based on a bunch of symmetries, so you have to test based on the symmetry rules you're using. Bugs hide in cracks and corners. Edge cases and corner cases are part of the implementation, and if you follow TDD unthinkingly you won't test them.
For continuous functions, you must test the edge and corner cases of the implementation.
This doesn't mean TDD is broken or inadequate. It says that slavish devotion to "test first" can't work without some thinking about what your real goal is.
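A minimal Python sketch of what a table of known good values plus symmetry checks might look like. Everything here is an assumption for illustration: `sine` is a stand-in that simply points at `math.sin` (replace it with the function under test), and the table entries, the tolerance `TOL` and the choice of quadrant boundaries as edge cases are hypothetical, not taken from this answer.

```python
import math

# Stand-in for the function under test (assumption: swap in your own sine).
sine = math.sin

TOL = 1e-12  # hypothetical tolerance; choose one that fits your accuracy target

# Small table of known good values (exact values from trigonometry).
KNOWN = [
    (0.0, 0.0),
    (math.pi / 6, 0.5),
    (math.pi / 2, 1.0),
    (math.pi, 0.0),
    (3 * math.pi / 2, -1.0),
]


def check_known_values():
    for x, expected in KNOWN:
        assert abs(sine(x) - expected) < TOL, x


def check_symmetries():
    # Probe at, just below and just above each quadrant boundary,
    # since range-reduction bugs tend to hide there.
    eps = 1e-9
    edges = [k * math.pi / 2 + d for k in range(-4, 5) for d in (-eps, 0.0, eps)]
    for x in edges:
        assert abs(sine(-x) + sine(x)) < TOL                 # odd symmetry
        assert abs(sine(math.pi - x) - sine(x)) < TOL        # reflection about pi/2
        assert abs(sine(x + 2 * math.pi) - sine(x)) < TOL    # periodicity


check_known_values()
check_symmetries()
```

If the real implementation relies on other symmetries (for example reducing everything to the first octant), the boundary list should follow those rules instead.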
In strict baby-step TDD, you might implement the dumb method to get back to green, and then refactor the duplication inherent in the dumb code (testing for the input value is a kind of duplication between the test and the code) by producing a real algorithm. The hard part about getting a feel for TDD with such an algorithm is that your acceptance tests are sitting right next to you (the table S. Lott suggests), so you keep an eye on them the whole time. In more typical TDD, the unit is divorced enough from the whole that the acceptance tests can't just be plugged in right there, so you don't start thinking about testing for all scenarios, because not all scenarios are obvious.
Typically, you might have a real algorithm after one or two cases. The important thing about TDD is that it drives the design, not the algorithm. Once you have enough cases to satisfy the design needs, the value of TDD drops significantly. The tests then shift toward covering corner cases, to ensure your algorithm is correct in every respect you can think of. So, if you are confident in how to build the algorithm, go for it. The kind of baby steps you are talking about is only appropriate when you are uncertain. By taking such baby steps you start to build out the boundaries of what your code has to cover, even though your implementation isn't real yet. But as I said, that is more for when you are uncertain about how to build the algorithm.
Write tests that verify identities.
For the sin(x) example, think about the double-angle formula and the half-angle formula.
Open a signal-processing textbook. Find the relevant chapters and implement every single one of those theorems/corollaries as test code applicable to your function. For most signal-processing functions there are identities that must be upheld between the inputs and the outputs. Write tests that verify those identities, regardless of what those inputs might be.
Then think about the inputs.
(Note 1) Make it work, make it correct, make it fast, make it cheap. - attributed to Alan Kay
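As a rough illustration of identity-based tests, here is a Python sketch that checks the double-angle, half-angle and Pythagorean identities over a grid of inputs. The names `sine` and `cosine` are stand-ins bound to `math.sin` and `math.cos` (an assumption; point them at your own functions), and the sample range, sample count and tolerance are hypothetical choices.

```python
import math

# Stand-ins for the functions under test (assumption).
sine = math.sin
cosine = math.cos


def check_identities(samples=1000, lo=-10.0, hi=10.0, tol=1e-12):
    """Check trigonometric identities on an evenly spaced grid of inputs."""
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        # Double-angle formula: sin(2x) = 2 sin(x) cos(x)
        assert abs(sine(2 * x) - 2 * sine(x) * cosine(x)) < tol, x
        # Half-angle formula (squared form): sin^2(x/2) = (1 - cos(x)) / 2
        assert abs(sine(x / 2) ** 2 - (1 - cosine(x)) / 2) < tol, x
        # Pythagorean identity: sin^2(x) + cos^2(x) = 1
        assert abs(sine(x) ** 2 + cosine(x) ** 2 - 1.0) < tol, x


check_identities()
```

Identity tests like these need no table of expected values, which is what lets them run over arbitrary inputs.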
I believe the time to jump to the first option is when you see there are too many "ifs" in your code "just to pass the tests". That wouldn't be the case yet, with just the two test inputs.
You'll feel the code is beginning to smell, and will be willing to refactor it asap. I'm not sure if that's what pure TDD says, but IMHO you do it in the refactor phase (test fail, test pass, refactor cycle). I mean, unless your failing tests ask for a different implementation.
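To make the smell concrete, here is a hypothetical Python sketch: `sine_dumb` grows one branch per test, and `sine_series` is one possible refactoring target. The truncated Taylor series is only an assumption chosen to keep the example short; it is not suggested in this thread and is only accurate for small arguments.

```python
import math


def sine_dumb(radians):
    # One hard-coded branch per test: this is the pile of "ifs" that smells.
    if radians == 0:
        return 0.0
    if radians == math.pi / 2:
        return 1.0
    if radians == math.pi / 6:
        return 0.5
    raise NotImplementedError("no test has asked for this input yet")


def sine_series(radians, terms=12):
    # Truncated Taylor series around 0: x - x^3/3! + x^5/5! - ...
    # Illustrative only; accurate for small |radians|.
    total = 0.0
    term = radians
    for n in range(terms):
        total += term
        term *= -radians * radians / ((2 * n + 2) * (2 * n + 3))
    return total


# The same three tests pass against both versions.
for implementation in (sine_dumb, sine_series):
    assert abs(implementation(0) - 0.0) < 1e-12
    assert abs(implementation(math.pi / 2) - 1.0) < 1e-12
    assert abs(implementation(math.pi / 6) - 0.5) < 1e-12
```

The tests stay the same across the refactoring, which is what makes the refactor phase a safe place to swap the branches for an algorithm.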
Note that (in NUnit) you can also compare with a tolerance when you're dealing with floating-point equality.
One piece of advice I remember reading was to try to refactor out the magic numbers from your implementations.
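A small Python sketch of tolerance-based comparison, offered as an analogue rather than the NUnit call this answer had in mind. `assert_close` and `ABS_TOLERANCE` are hypothetical names; `math.isclose` is the standard-library function doing the work.

```python
import math

# Named constant instead of a magic number sprinkled through the tests.
ABS_TOLERANCE = 1e-9  # hypothetical value; pick one that fits your accuracy target


def assert_close(expected, actual, tol=ABS_TOLERANCE):
    """Fail unless actual is within an absolute tolerance of expected."""
    if not math.isclose(expected, actual, rel_tol=0.0, abs_tol=tol):
        raise AssertionError(f"expected {expected}, got {actual} (tol={tol})")


# math.sin(math.pi) is a tiny non-zero number, so exact equality would fail.
assert math.sin(math.pi) != 0.0
assert_close(0.0, math.sin(math.pi))
assert_close(1.0, math.sin(math.pi / 2))
```

An absolute tolerance is used here because the expected value can be zero, where a purely relative tolerance never matches.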
You should code up all your unit tests in one hit (in my opinion). While the idea of only creating tests specifically covering what has to be tested is correct, your particular specification calls for a functioning sine() function, not a sine() function that only works for the two values in your tests.
Find a source you trust enough (a mathematician friend, tables at the back of a math book, or another program that already has the sine function implemented).
I opted for bash/bc because I'm too lazy to type it all in by hand :-). If it were a sine() function, I'd just run a small script and paste its output into the test code. I'd also put a copy of the script in there as a comment so I can re-use it if something changes (such as the desired resolution, 20 degrees in this case, or the value of PI you want to use).
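The bash/bc script itself is not reproduced here. As a stand-in, this Python sketch generates a comparable table in 20-degree steps, using `math.sin` as the trusted source (an assumption). `STEP_DEGREES` and `reference_rows` are hypothetical names, and the printed format is just one that pastes easily into a list of test cases.

```python
import math

STEP_DEGREES = 20  # resolution of the reference table


def reference_rows(step=STEP_DEGREES):
    """Return (degrees, sine) pairs from 0 to 360 inclusive."""
    return [(deg, math.sin(math.radians(deg))) for deg in range(0, 361, step)]


# Print rows in a form that can be pasted into the test code.
for deg, value in reference_rows():
    print(f"    ({deg:3d}, {value:+.12f}),")
```

Keeping the generator next to the pasted table, as the answer suggests, makes it cheap to regenerate the values at a different resolution.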
Obviously you will need to map this answer to what your real function is meant to do. My point is that the tests should fully validate the behavior of the code in this iteration. If this iteration was meant to produce a sine() function that only works for the two tested inputs, then that's fine. But that would be a serious waste of an iteration, in my opinion.
It may be that your function is so complex that it must be done over several iterations. Then your approach two is correct, and the tests should be updated in the next iteration, where you add the extra functionality. Otherwise, find a way to add all the tests for this iteration quickly; then you won't have to worry about switching between real code and test code frequently.
Strictly following TDD, you can first implement the dumbest code that will work. In order to jump to the first option (to implement the real code), add more tests.
If you implement more than what is absolutely required by your tests, then your tests will not completely cover your implementation. For example, if you implemented the whole sin() function with just the two tests above, you could accidentally "break" it by returning a triangle function (which almost looks like a sine function), and your tests would not be able to detect the error.
The other thing you will have to worry about for numeric functions is the notion of "equality" and having to deal with the inherent loss of precision in floating-point calculations. That's what I thought your question was going to be about after reading just the title. :)
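The triangle-function point can be shown with a short Python sketch. `triangle` is a hypothetical, deliberately wrong implementation, and the two tests are assumed to check an input of 0 and an input at the first peak (pi/2); a third test at pi/6 is enough to expose it.

```python
import math


def triangle(radians):
    # Triangle wave with period 2*pi and peaks of +1 and -1: it looks
    # roughly like a sine wave but is the wrong function.
    x = math.fmod(radians, 2 * math.pi)
    if x < 0:
        x += 2 * math.pi
    quarter = math.pi / 2
    if x <= quarter:
        return x / quarter
    if x <= 3 * quarter:
        return 2.0 - x / quarter
    return x / quarter - 4.0


# Two tests alone cannot tell the triangle wave from a real sine.
assert triangle(0) == 0.0
assert triangle(math.pi / 2) == 1.0

# One extra test point exposes it: sin(pi/6) is 0.5, the triangle gives about 0.333.
assert abs(triangle(math.pi / 6) - 0.5) > 0.1
```

Each added test point narrows the set of wrong implementations that can still pass.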
I don't know what language you are using, but when I am dealing with a numeric method, I typically write a simple test like yours first to make sure the outline is correct, and then I feed in more values to cover cases where I suspect things might go wrong. In .NET, NUnit 2.5 has a nice feature for this, called [TestCase], where you can feed multiple input values to the same test.
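For readers outside .NET, here is a rough Python analogue of feeding many input values to one test, using `unittest` with `subTest`. It is not the NUnit syntax; `SineTest` and `CASES` are hypothetical names, and `math.sin` stands in for the function under test.

```python
import math
import unittest

# (input in radians, expected value) pairs fed to a single test method.
CASES = [
    (0.0, 0.0),
    (math.pi / 6, 0.5),
    (math.pi / 2, 1.0),
    (math.pi, 0.0),
    (3 * math.pi / 2, -1.0),
]


class SineTest(unittest.TestCase):
    def test_known_values(self):
        for radians, expected in CASES:
            # subTest reports each failing input separately instead of
            # stopping at the first failure.
            with self.subTest(radians=radians):
                self.assertAlmostEqual(expected, math.sin(radians), places=12)
```

Saved as a module, this can be run with `python -m unittest`; adding a suspicious input is then a one-line change to `CASES`.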
Short answer.
Another question you seem to have is how many tests you should write. You need to test until fear (that the function may not work) turns into boredom. So once you've tested all the interesting input-output combinations, you're done.