How do I properly achieve test isolation with a stateful Python module?
The project I'm working on is a piece of business logic software wrapped up as a Python package. The idea is that various scripts or applications will import it, initialize it, then use it.
It currently has a top-level init() method that does the initialization and sets up various things. A good example is that it sets up SQLAlchemy with a db connection and stores the SA session for later access. The session is stored in a subpackage of my project (namely myproj.model.Session, so other code can get a working SA session after importing the model).
Long story short, this makes my package a stateful one. I'm writing unit tests for the project and this stateful behaviour poses some problems:
- tests should be isolated, but the internal state of my package breaks this isolation
- I cannot test the main init() method since its behavior depends on the state
- future tests will need to be run against the (not yet written) controller part with a well-known model state (e.g. a pre-populated sqlite in-memory db)
Should I somehow refactor my package because the current structure is not the Best (possible) Practice(tm)? :)
Should I leave it at that and set up and tear down the whole thing every time? If I'm going to achieve complete isolation, that would mean fully erasing and re-populating the db for every single test. Isn't that overkill?
This question is really about the overall code and test structure, but for what it's worth I'm using nose-1.0 for my tests. I know the Isolate plugin could probably help me, but I'd like to get the code right before doing strange things in the test suite.
Comments (3)
You have a few options:
Mock the database
There are a few trade-offs to be aware of.
Your tests will become more complex, as you will have to do the setup, teardown and mocking of the connection. You may also want to verify the SQL/commands sent. It also tends to create an odd sort of tight coupling, which may cause you to spend additional time maintaining and updating tests when the schema or SQL changes.
This is usually the purest form of test isolation because it removes a potentially large dependency from the tests. It also tends to make tests faster and reduces the overhead of automating the test suite in, say, a continuous integration environment.
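As a rough illustration of the mocking option, here is a minimal sketch using the standard library's unittest.mock. The function count_active_users and the SQL string are hypothetical and only stand in for real business logic; the point is that the test never touches a database, and that asserting on the exact SQL is where the tight coupling mentioned above comes from.

```python
from unittest import mock


def count_active_users(session):
    """Hypothetical business-logic function; it only needs a session-like object."""
    rows = session.execute("SELECT id FROM users WHERE active = 1").fetchall()
    return len(rows)


def test_count_active_users():
    # The mock stands in for the database session; no connection is opened.
    session = mock.Mock()
    session.execute.return_value.fetchall.return_value = [(1,), (2,)]

    assert count_active_users(session) == 2

    # Verifying the SQL sent couples the test to the query text:
    # any change to the query means updating this assertion too.
    session.execute.assert_called_once_with(
        "SELECT id FROM users WHERE active = 1"
    )
```

This works most smoothly when the code under test receives its session as an argument rather than reaching for package-level state.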
Recreate the DB with each test
There are trade-offs to be aware of here too.
This can make your tests very slow, depending on how much time it actually takes to recreate your database. If the dev database server is a shared resource, there will have to be an additional initial investment in making sure each dev has their own db on the server. The server may be impacted depending on how often tests get run. There is additional overhead in running your test suite in a continuous integration environment, because it will need at least one database, possibly more (depending on how many branches are being built simultaneously).
The benefit has to do with actually running through the same code paths and similar resources that will be used in production. This usually helps to reveal bugs earlier which is always a very good thing.
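A minimal sketch of the recreate-per-test approach, assuming a file-backed SQLite database and the standard library's unittest (nose can collect TestCase classes like this one). The schema, table name and seed row are invented for illustration; a real project would create its own schema and fixtures in setUp.

```python
import os
import sqlite3
import tempfile
import unittest

# Hypothetical schema, used only for this illustration.
SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"


class UserTableTest(unittest.TestCase):
    def setUp(self):
        # Build a brand-new database file with a known state before each test.
        fd, self.path = tempfile.mkstemp(suffix=".db")
        os.close(fd)
        self.conn = sqlite3.connect(self.path)
        self.conn.execute(SCHEMA)
        self.conn.execute("INSERT INTO users (name) VALUES ('alice')")
        self.conn.commit()

    def tearDown(self):
        # Throw the whole database away so nothing leaks into the next test.
        self.conn.close()
        os.remove(self.path)

    def _count(self):
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_insert_is_visible(self):
        self.conn.execute("INSERT INTO users (name) VALUES ('bob')")
        self.assertEqual(self._count(), 2)

    def test_starts_from_known_state(self):
        # Passes regardless of test order, because setUp rebuilt the database.
        self.assertEqual(self._count(), 1)
```

The cost is visible in setUp: every test pays for schema creation and seeding, which is the slowdown described above.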
ORM DB swap
If you're using an ORM like SQLAlchemy, there is a possibility that you can swap the underlying database for a potentially faster in-memory database. This allows you to mitigate some of the negatives of both of the previous options.
It's not quite the same database as will be used in production, but the ORM should reduce the risk that the difference hides a bug. Typically the time to set up an in-memory database is much shorter than for one which is file-backed. It also has the benefit of being isolated to the current test run, so you don't have to worry about shared resource management or final teardown/cleanup.
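The swap only works if the initialization code lets the caller choose the database. The sketch below shows that shape using the standard library's sqlite3 as a stand-in for a SQLAlchemy engine, purely so that it runs without third-party packages. The module-level _connection, init() signature, get_connection() and the items table are all hypothetical. With SQLAlchemy, the analogous move would be to pass an in-memory SQLite URL such as sqlite:///:memory: to create_engine.

```python
import sqlite3

# Hypothetical stand-in for the state that myproj.model would hold.
_connection = None


def init(database=":memory:"):
    """Open the given database and remember the connection.

    Calling init() again replaces the earlier connection, which gives
    tests a cheap way to start from a clean state.
    """
    global _connection
    if _connection is not None:
        _connection.close()
    _connection = sqlite3.connect(database)
    _connection.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
    return _connection


def get_connection():
    """Return the connection created by init()."""
    if _connection is None:
        raise RuntimeError("init() has not been called")
    return _connection
```

A test setup function can then call init(":memory:") to get a fresh, private database, while production scripts pass the real target.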
Working on a project with a relatively expensive setup (IPython), I've seen an approach used where we call a get_ipython function, which sets up and returns an instance, while replacing itself with a function which returns a reference to the existing instance. Then every test can call the same function, but it only does the setup for the first one.
That saves doing a long setup procedure for every test, but occasionally it creates odd cases where a test fails or passes depending on what tests were run before. We have ways of dealing with that: a lot of the tests should do the same thing regardless of the state, and we can try to reset the object's state before certain tests. You might find a similar trade-off works for you.
Mock is a simple and powerful tool to achieve some isolation. There is a nice video from PyCon 2011 which shows how to use it. I recommend using it together with py.test, which reduces the amount of code required to define tests and is still very, very powerful.
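A minimal sketch of what that combination might look like, written as a plain test function of the kind py.test collects. It uses unittest.mock, the standard-library descendant of the Mock library. The model namespace below is a hypothetical stand-in for myproj.model, and list_names is invented code under test.

```python
import types
from unittest import mock

# Hypothetical stand-in for the myproj.model module and its stored Session.
model = types.SimpleNamespace(Session=None)


def list_names():
    """Hypothetical code under test: uses whatever Session the package holds."""
    return [row.name for row in model.Session().query("names")]


def test_list_names_uses_patched_session():
    fake_session = mock.Mock()
    fake_session.query.return_value = [types.SimpleNamespace(name="alice")]

    # Replace the package-level Session only for the duration of the block.
    with mock.patch.object(model, "Session", return_value=fake_session):
        assert list_names() == ["alice"]

    # The patch is undone on exit, so no state leaks into the next test.
    assert model.Session is None
```

Because the patch is scoped to the with block, the package's global state is the same before and after the test, which is the isolation the question asks about.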