Testing Cassandra with an embedded server in JUnit
What is the best approach to writing unit tests for code that persists data to a NoSQL data store, in our case Cassandra?
=> We are using the embedded server approach, with a utility from GitHub (https://github.com/hector-client/hector/blob/master/test/src/main/java/me/prettyprint/hector/testutils/EmbeddedServerHelper.java). However, I have been seeing some issues with it:
1) Data persists across test cases, which makes it hard to ensure that each test case in a test class starts from known data. I tried calling cleanUp in an @After method after each test case, but that doesn't seem to clean up the data.
2) We are running out of memory as we add more tests. This could be a consequence of 1), but I am not sure yet. I currently run my build with a 1 GB heap.
=> The other approach I have been considering is to mock the Cassandra storage. But that might hide issues in the Cassandra schema, since the embedded approach has often caught problems with the way data is stored in Cassandra.
Please let me know your thoughts on this, and whether anyone has used EmbeddedServerHelper and is familiar with the issues I have mentioned.
Just an update: I was able to resolve 2), the build running out of Java heap space, by changing the in_memory_compaction_limit_in_mb parameter to 32 in the cassandra.yaml used by the embedded test server. This link helped me: http://www.datastax.com/docs/0.7/configuration/storage_configuration#in-memory-compaction-limit-in-mb. The value was previously 64, and the build had started failing consistently during compaction.
Comments (6)
We use an embedded Cassandra server, and I think that is the best approach when testing Cassandra; mocking the Cassandra API is too error-prone.
EmbeddedServerHelper.cleanup() just removes files from the file system, but data may still exist in memory. There is a teardown() method in EmbeddedServerHelper, but I am not sure how effective it is, as Cassandra has a lot of static singletons whose state is not cleaned up by teardown().
What we do is have a method that calls truncate on each column family between tests. That removes all data.
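The truncate-between-tests idea can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: ColumnFamilyAdmin, InMemoryAdmin, and TestDataCleaner are hypothetical names introduced here, and the in-memory stand-in exists only so the sketch runs without a Cassandra node. In a real suite the interface would presumably wrap whatever client the tests already use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical admin abstraction (an assumption, not a Hector type). */
interface ColumnFamilyAdmin {
    List<String> listColumnFamilies(String keyspace);
    void truncate(String keyspace, String columnFamily);
}

/** In-memory stand-in so the sketch runs without a Cassandra node. */
class InMemoryAdmin implements ColumnFamilyAdmin {
    // column family name -> rows (row key -> value)
    final Map<String, Map<String, String>> data = new LinkedHashMap<>();

    public List<String> listColumnFamilies(String keyspace) {
        return new ArrayList<>(data.keySet());
    }

    public void truncate(String keyspace, String columnFamily) {
        Map<String, String> rows = data.get(columnFamily);
        if (rows != null) {
            rows.clear(); // keeps the column family itself, drops its rows
        }
    }
}

class TestDataCleaner {
    /** Truncates every column family in the keyspace; returns how many were truncated. */
    static int truncateAll(ColumnFamilyAdmin admin, String keyspace) {
        int count = 0;
        for (String cf : admin.listColumnFamilies(keyspace)) {
            admin.truncate(keyspace, cf);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        InMemoryAdmin admin = new InMemoryAdmin();
        admin.data.computeIfAbsent("Users", k -> new LinkedHashMap<>()).put("u1", "alice");
        admin.data.computeIfAbsent("Orders", k -> new LinkedHashMap<>()).put("o1", "book");
        int truncated = truncateAll(admin, "TestKeyspace");
        System.out.println("Truncated " + truncated + " column families");
    }
}
```

In a JUnit test class, a call like TestDataCleaner.truncateAll(...) would typically go in an @After method, so that every test starts from empty column families while the schema stays in place.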
I think you can take a look at cassandra-unit: https://github.com/jsevellec/cassandra-unit/wiki
I use the Mojo Cassandra Maven plugin.
Here's an example plugin configuration that I use to spin up a Cassandra server for use by my unit tests:
I did manage to get Hector's embedded server helper class working, which can be very useful; however, I ran into classloader conflicts due to this bug.
You cannot restart a Cassandra instance within one JVM: Cassandra has a "shutdown per kill" policy because of the singletons it uses.
You also do not need to restart Cassandra; just remove all column families (CFs). To remove a CF, you first need to flush its data, then compact it, and only then can you drop it.
This code will connect to the embedded Cassandra and execute the required cleanup:
Now execute the CLI drop CF script:
and script.txt could have:
By "doesn't seem to clean up data", what exactly do you mean? That you still see your data in the database?
That problem might be because Cassandra doesn't delete the "values" instantly, but only after gc_grace_seconds seconds have passed (this usually defaults to 10 days). Until then, Cassandra only marks the values as deleted.
In addition to what's been posted, there are cases where you want to test error handling: how does your app behave when a Cassandra query fails?
There are a few libraries that can help you with this:
I'm the author of cassandra-spy and wrote it to help me test these cases.
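To make the error-handling case concrete, here is a minimal hand-rolled sketch of the idea. It does not use cassandra-spy or any other library's API; UserStore, FailingUserStore, and GreetingService are hypothetical names invented for illustration. The point is only that the application code sits behind an interface, and a test double that always throws lets the test exercise the failure path.

```java
/** Hypothetical storage interface the application code depends on. */
interface UserStore {
    String findName(String userId);
}

/** Test double that fails every query, simulating a failed Cassandra query. */
class FailingUserStore implements UserStore {
    public String findName(String userId) {
        throw new IllegalStateException("simulated Cassandra failure");
    }
}

/** Application code under test: should degrade gracefully when the store fails. */
class GreetingService {
    private final UserStore store;

    GreetingService(UserStore store) {
        this.store = store;
    }

    String greet(String userId) {
        try {
            return "Hello, " + store.findName(userId);
        } catch (RuntimeException e) {
            return "Hello, guest"; // fallback path the test wants to exercise
        }
    }
}

class ErrorHandlingDemo {
    public static void main(String[] args) {
        GreetingService healthy = new GreetingService(id -> "alice");
        GreetingService broken = new GreetingService(new FailingUserStore());
        System.out.println(healthy.greet("u1"));
        System.out.println(broken.greet("u1"));
    }
}
```

A stub like this only checks the application's reaction to a failure; it says nothing about whether the schema or queries are correct, which is what the embedded-server tests discussed above are for.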