How do I effectively test a fixed-length flat file parser using MSpec?
I have this method signature: List<ITMData> Parse(string[] lines). ITMData has 35 properties.
How would you effectively test such a parser?
Questions:
- Should I load the whole file (may I use System.IO)?
- Should I put a line from the file into a string constant?
- Should I test one line or several?
- Should I test each property of ITMData or should I test the whole object?
- What about the naming of my test?
EDIT
I changed the method signature to ITMData Parse(string line).
Test Code:
[Subject(typeof(ITMFileParser))]
public class When_parsing_from_index_59_to_79
{
private const string Line = ".........";
private static ITMFileParser _parser;
private static ITMData _data;
private Establish context = () => { _parser = new ITMFileParser(); };
private Because of = () => { _data = _parser.Parse(Line); };
private It should_get_fldName = () => _data.FldName.ShouldBeEqualIgnoringCase("HUMMELDUMM");
}
EDIT 2
I am still not sure whether I should test only one property per class. In my opinion, this lets the specification carry more information, namely that when I parse a single line from index 59 to index 79, I get fldName. If I test all properties within one class, I lose this information. Am I overspecifying my tests?
My tests now look like this:
[Subject(typeof(ITMFileParser))]
public class When_parsing_single_line_from_ITM_file
{
const string Line = "";
static ITMFileParser _parser;
static ITMData _data;
Establish context = () => { _parser = new ITMFileParser(); };
private Because of = () => { _data = _parser.Parse(Line); };
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
It should_get_fld??? = () => _data.Fld???.ShouldEqual(???);
...
}
If you load the whole file, it's no longer a unit test; it becomes an integration or regression test. You can do this if you expect it to expose bugs that a unit test wouldn't, but that's not too likely.
You're probably better off with unit tests, at least to start.
If you plan to write more than one test that uses the same input line, then sure, put the line in a constant. But personally, I would tend to write a bunch of different tests, each passing a different input string. At that point, there's not much reason to make a constant (unless it's a local constant, declared inside the test method).
You didn't specify, but I'm going to assume that your output is one-for-one with your input: if you pass in three strings, you'll get three ITMData instances back. In that case, the need for multi-line tests is limited. It's almost always worth testing the degenerate case, which here is an empty string array (zero lines). And it's probably worth having at least one test with more than one line, just to make sure there are no silly mistakes in your iteration.
However, if your output is one-for-one with your input, then you really have another method wanting to get out: you should have a ParseSingleLine method. Then your Parse would do nothing more than iterate over the lines and call ParseSingleLine. You would still want a handful of tests for Parse, but most of your testing would focus on ParseSingleLine.
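A minimal sketch of the Parse / ParseSingleLine split, assuming a hypothetical two-field layout. The real ITMData has 35 properties, and the Code property and the column positions used here are invented purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, cut-down stand-in for the real ITMData (which has 35 properties).
public class ITMData
{
    public string Code { get; set; }
    public string FldName { get; set; }
}

public class ITMFileParser
{
    // Assumed layout, for illustration only:
    // Code in columns 0-4, FldName in columns 5-24.
    public ITMData ParseSingleLine(string line)
    {
        return new ITMData
        {
            Code = line.Substring(0, 5).Trim(),
            FldName = line.Substring(5, 20).Trim()
        };
    }

    // Parse only iterates; all field logic lives in ParseSingleLine.
    public List<ITMData> Parse(string[] lines)
    {
        var result = new List<ITMData>();
        foreach (var line in lines)
        {
            result.Add(ParseSingleLine(line));
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var parser = new ITMFileParser();

        // Degenerate case: zero lines in, zero records out.
        Console.WriteLine(parser.Parse(new string[0]).Count);

        // More than one line, to catch mistakes in the iteration.
        var lines = new[]
        {
            "A0001" + "HUMMELDUMM".PadRight(20),
            "B0002" + "OTHER".PadRight(20)
        };
        List<ITMData> records = parser.Parse(lines);
        Console.WriteLine(records.Count + " " + records[1].FldName);
    }
}
```

With this shape, the two Parse checks in Main are about all Parse needs; the per-field tests can target ParseSingleLine directly.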
Here's what I would normally do if I were facing such a problem:
One short disclaimer in advance: I would lean toward the "integration testing" or "testing the parser as a whole" route rather than testing individual lines. More than once in the past, lots of implementation details leaked into my tests and forced me to change the tests whenever I changed those details. A typical case of overspecification, I guess ;-/
I typically try to consider common success and failure scenarios, along with edge cases. Requirements are also helpful for setting up appropriate use cases. Consider Pex for enumerating various scenarios.
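For a fixed-length parser, those scenarios might translate into checks like the ones below. This is only a sketch: the 25-character record length, the ParseName helper, the Check helper, and the decision to throw FormatException on a short line are all assumptions, not the asker's actual behaviour.

```csharp
using System;

public static class FixedWidth
{
    // Assumed for illustration: FldName occupies columns 5-24 of a 25-character record.
    public const int RecordLength = 25;

    public static string ParseName(string line)
    {
        if (line == null) throw new ArgumentNullException(nameof(line));
        if (line.Length < RecordLength)
            throw new FormatException("Line is shorter than " + RecordLength + " characters.");
        return line.Substring(5, 20).Trim();
    }
}

public static class Program
{
    // Tiny stand-in for a test framework assertion.
    static void Check(bool condition, string message)
    {
        if (!condition) throw new Exception(message);
    }

    public static void Main()
    {
        // Success: a space-padded value is trimmed.
        Check(FixedWidth.ParseName("A0001" + "HUMMELDUMM".PadRight(20)) == "HUMMELDUMM",
              "padded value should be trimmed");

        // Edge: a field containing only spaces comes back empty.
        Check(FixedWidth.ParseName(new string(' ', 25)) == "",
              "blank field should be empty");

        // Edge: a value that fills the whole field is returned unchanged.
        Check(FixedWidth.ParseName("A0001" + new string('X', 20)) == new string('X', 20),
              "full-width value should be kept");

        // Failure: a truncated line is rejected.
        bool threw = false;
        try { FixedWidth.ParseName("A0001"); }
        catch (FormatException) { threw = true; }
        Check(threw, "short line should throw");

        Console.WriteLine("all checks passed");
    }
}
```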
Regarding your newer questions:
If you want to be on the safe side, you should probably have at least one test which checks that each property was matched.
There are quite a few discussions on this topic, such as this one. The general rule is that you would have multiple methods in your unit test class, each aimed at testing something specific. In your case, it might be things like:
So, in other words, testing for the exact behaviour that you consider "correct" for the parser. Once this is done, you will feel much more at ease when making changes to the parser code, because you will have a comprehensive test suite to check that you didn't break anything. Remember to actually test often, and to keep your tests updated when you make changes! There's a fairly good guide about unit testing and Test Driven Development on MSDN.
In general, I think you can find answers to most of your questions by googling a bit. There are also several excellent books on Test Driven Development, which will drive home not only the how of TDD, but the why. If you are relatively programming language agnostic, I would recommend Kent Beck's Test Driven Development By Example, otherwise something like Test-Driven Development in Microsoft .NET. These should get you on the right track very quickly.
EDIT:
In my opinion, yes, you are overspecifying. Specifically, I don't agree with your claim that you lose information by testing all properties within one class.
In what way do you lose information exactly? Let's say there are two ways to do this test, other than having a new class per test:
Option 1: a separate test method per property: CheckPropertyX, CheckPropertyY, etc. When you run your tests, you will see exactly which fields passed and which fields failed. This clearly satisfies your requirements, although I would say it's still overkill.
Option 2, which is the one I would go with: assert on each property with a descriptive message:
Assert.AreEqual("test1", myObject.PropertyX, "Property X was incorrectly parsed");
Assert.AreEqual("test2", myObject.PropertyY, "Property Y was incorrectly parsed");
When one of those fails, you will know which line failed. When you have fixed the relevant error, and re-run your tests, you will see if any other properties have failed. This is generally the approach that most people take, because creating a class or even method per property results in too much code, and too much work to keep up to date.
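The message-per-assertion approach can be sketched as a runnable program. To keep it self-contained, the AreEqual helper below is a hand-written stand-in for a test framework's Assert.AreEqual, and the two-property ITMData and its column layout are hypothetical:

```csharp
using System;

// Hypothetical two-property stand-in for the real ITMData.
public class ITMData
{
    public string PropertyX { get; set; }
    public string PropertyY { get; set; }
}

public static class Program
{
    // Stand-in for a test framework's Assert.AreEqual(expected, actual, message).
    public static void AreEqual(string expected, string actual, string message)
    {
        if (expected != actual)
            throw new Exception(message + " (expected '" + expected + "', got '" + actual + "')");
    }

    // Assumed layout, for illustration only:
    // PropertyX in columns 0-4, PropertyY in columns 5-9.
    public static ITMData Parse(string line)
    {
        return new ITMData
        {
            PropertyX = line.Substring(0, 5),
            PropertyY = line.Substring(5, 5)
        };
    }

    public static void Main()
    {
        ITMData myObject = Parse("test1test2");

        // One test, one assertion per property; the message names the failing field.
        AreEqual("test1", myObject.PropertyX, "Property X was incorrectly parsed");
        AreEqual("test2", myObject.PropertyY, "Property Y was incorrectly parsed");

        Console.WriteLine("all properties parsed as expected");
    }
}
```

Because the first failing assertion throws, the test stops there, which matches the fix-and-re-run workflow described above.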