Unit testing with a complex directory structure
I am trying to use test-driven development for an application that has to read a lot of data from disk. The problem is that the data is organized on the filesystem in a somewhat complex directory structure (not my fault). The methods I'm testing will need to see that a large number of files exist in several different directories in order for the methods to complete.
The solution I'm trying to avoid is just having a known folder on the hard drive with all the data in it. This approach sucks for several reasons, one reason being that if we wanted to run the unit tests on another computer, we'd have to copy a large amount of data to it.
I could also generate dummy files in the setup method and clean them up in the teardown method. The problem with this is that it would be a pain to write the code to replicate the existing directory structure and dump lots of dummy files into those directories.
I understand how to unit test file I/O operations, but how do I unit test this kind of scenario?
Edit:
I will not need to actually read the files. The application will need to analyze a directory structure and determine what files exist in it. And this is a large number of subdirectories with a large number of files.
I would define a set of interfaces that mimic the file system, such as `IDirectory` and `IFile`, and then use Test Doubles to create a representation of the directory structure in memory.
This will allow you to unit test (and vary) that structure to your heart's content.
You will also need concrete implementations that implement those interfaces using the real BCL classes for that purpose.
This lets you vary data structure and data access independently of each other.
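The answer above is phrased in .NET terms, but the same split can be sketched in Python. This is only an illustration under assumptions: `Directory`, `InMemoryDirectory`, `RealDirectory`, and `count_files` are hypothetical names invented here, not anything from the original post.

```python
from __future__ import annotations

import os
from dataclasses import dataclass, field
from typing import Iterable, Protocol


class Directory(Protocol):
    """Hypothetical abstraction that the code under test depends on."""

    name: str

    def files(self) -> Iterable[str]: ...
    def subdirectories(self) -> Iterable["Directory"]: ...


@dataclass
class InMemoryDirectory:
    """Test double: a directory tree that exists only in memory."""

    name: str
    file_names: list[str] = field(default_factory=list)
    children: list["InMemoryDirectory"] = field(default_factory=list)

    def files(self) -> Iterable[str]:
        return list(self.file_names)

    def subdirectories(self) -> Iterable["InMemoryDirectory"]:
        return list(self.children)


@dataclass
class RealDirectory:
    """Production implementation backed by the real file system."""

    path: str

    @property
    def name(self) -> str:
        return os.path.basename(self.path)

    def files(self) -> Iterable[str]:
        return [e.name for e in os.scandir(self.path) if e.is_file()]

    def subdirectories(self) -> Iterable["RealDirectory"]:
        return [RealDirectory(e.path) for e in os.scandir(self.path) if e.is_dir()]


def count_files(root: Directory) -> int:
    """Example code under test: counts the files in the whole tree."""
    return len(list(root.files())) + sum(
        count_files(child) for child in root.subdirectories()
    )


# A fake tree built in a couple of lines, with no disk access at all.
fake_root = InMemoryDirectory(
    "data",
    file_names=["index.txt"],
    children=[InMemoryDirectory("2009", file_names=["jan.dat", "feb.dat"])],
)
```

A test would hand `fake_root` to `count_files`, while production code would pass a `RealDirectory` pointing at the actual data folder.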
This has a Python perspective. You may not be working in Python, but the answer more-or-less applies to most languages.
When unit testing against any external resource (e.g. the `os` module), you have to mock out the external resource.
The question is "how do you mock out `os.walk`?" (or `os.listdir`, or whatever you're using.)
Write a mock version of the function, `os.walk` for example. Each mocked-out version returns a list of directories and files so that you can exercise your application.
How to build this?
Write a "data grabber" that does `os.walk` on real data and creates a big-old flat list of responses you can use for testing.
Create a mock directory structure. "It would be a pain to write the code to replicate the existing directory structure" isn't usually true. The mocked directory structure is simply a flat list of names. There's no pain at all.
Consider this
That's all that's required for `setUp`. `tearDown` is similar.
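As a rough illustration of the mocking approach in this answer, here is a minimal sketch using `unittest.mock.patch`. It is not the answerer's code: `grab_walk`, `find_files_by_extension`, `FindFilesTest`, and the paths in `FAKE_WALK` are all made up for the example.

```python
import os
import unittest
from unittest import mock


def grab_walk(root):
    """'Data grabber': record what os.walk yields for a real tree.

    Run it once against the real data and paste the result into the tests.
    """
    return [(dirpath, list(dirnames), list(filenames))
            for dirpath, dirnames, filenames in os.walk(root)]


def find_files_by_extension(root, extension):
    """Hypothetical code under test: inspects names only, never opens files."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extension):
                matches.append(f"{dirpath}/{filename}")
    return matches


# Canned responses in the shape os.walk produces: (dirpath, dirnames, filenames).
FAKE_WALK = [
    ("/data", ["2009", "2010"], ["index.txt"]),
    ("/data/2009", [], ["jan.dat", "feb.dat"]),
    ("/data/2010", [], ["jan.dat"]),
]


class FindFilesTest(unittest.TestCase):
    def setUp(self):
        # Replace os.walk with a mock that returns the canned list.
        patcher = mock.patch("os.walk", return_value=FAKE_WALK)
        self.fake_walk = patcher.start()
        # addCleanup plays the role of tearDown: it restores the real os.walk.
        self.addCleanup(patcher.stop)

    def test_finds_dat_files(self):
        found = find_files_by_extension("/data", ".dat")
        self.assertEqual(found, [
            "/data/2009/jan.dat",
            "/data/2009/feb.dat",
            "/data/2010/jan.dat",
        ])
        self.fake_walk.assert_called_once_with("/data")
```

Because the fake tree is just a list of tuples, adding another directory to a test is a one-line change, and nothing touches the disk.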
Whew, that sounds like a beast. I've been dabbling in testing myself.
It sounds like the main focus of your question is "How do I set up a large number of files so that I can test methods that check that said files exist?"
You mention several possible solutions. You said that you don't want to simply have a folder on the hard drive full of test data because you wouldn't want to have to go through the process of copying the data to another computer, which is understandable.
You also mention that you could write methods to generate dummy files, but it would be a pain to replicate the data structure.
Roy Osherove says in The Art of Unit Testing that it's a great idea to maintain and version your test code as your project is maintained and versioned.
I think that for the sake of consistency, it would make sense to create some dummy data and place it in some kind of source control repository with your test code. That way, you could streamline the process of copying the dummy data onto another computer and not have to worry about keeping track of which dummy data is on which machine. That would be a pain!
My solution: place dummy data in source control.
A possible solution would be to create the dummy file and directory structure from a tar file that your setup method deploys.
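In Python, the tar-file idea could look roughly like the sketch below, which uses the standard `tarfile` and `tempfile` modules. The helper names (`build_fixture_archive`, `deploy_fixture`, `remove_fixture`) are assumptions for illustration. Since the question says the files never need to be read, the archive here holds only empty placeholder files, so it stays tiny.

```python
import io
import os
import shutil
import tarfile
import tempfile


def build_fixture_archive(archive_path, relative_paths):
    """Write a tar file containing one empty file per relative path.

    Meant to be run once; the resulting archive can live next to the tests.
    """
    with tarfile.open(archive_path, "w") as archive:
        for relative_path in relative_paths:
            member = tarfile.TarInfo(name=relative_path)
            member.size = 0
            archive.addfile(member, io.BytesIO(b""))


def deploy_fixture(archive_path):
    """Extract the archive into a fresh temporary directory; return its path.

    Call this from setUp. The archive is assumed to be trusted, since the
    test suite built it itself.
    """
    target = tempfile.mkdtemp(prefix="fixture_")
    with tarfile.open(archive_path) as archive:
        archive.extractall(target)
    return target


def remove_fixture(target):
    """Delete the extracted tree; call this from tearDown."""
    shutil.rmtree(target, ignore_errors=True)
```

A test case would call `deploy_fixture` in its setup method, point the code under test at the returned directory, and call `remove_fixture` in teardown.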