Are there concurrency issues here? How does one test for them during development?
Scenario: There are 'n' teams, each working on its own virtual 'wall' (like Facebook's wall). Each team sees only its own wall and the posts on it. A post can be edited by its author or, if so configured, by another team member. Assume this is the case, since it's a must-have.
Design/technology decisions: RESTful web app using Restlet + Glassfish/Java + MySQL (EDIT: Using Apache DBUtils for DB access. No ORM; it seemed like overkill)
Question: Multiple teams, say T1, T2 and T3, log on, each with some number of members. There is concurrency in data access at the team level but not across teams, i.e., different teams access disjoint data sets. To optimize frequent reads and writes against the DB, we are considering a TeamGateway that controls access to the DB in order to handle concurrency. The web server would cache the data retrieved by each team to speed up reads (and also to help update the list of wall posts).
- Q1: Is this (a TableGateway per team plus a cache) even required? If not, how do you suggest it be handled?
- Q2: If so, does the TableGateway (for each team) need to be coded as thread-safe (synchronized methods)? Let's say we have a class/registry TableGatewayFinder with a static method that returns the TableGateway to use for a particular team (using a hashmap).
If 6 people from each of T1 to T3 log on, would ONLY 3 TableGateways be created? Would that help catch concurrent writes (a simple timestamp comparison before committing, or a "conflict-flagged" append) and manage the caching effectively? (We plan on having identity maps for the entities. There are 4-5 different entities to track: 4 form a composition hierarchy, and another one is associated with each of the 4.)
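For illustration only, the registry described in Q2 might be sketched as below. This is an assumption about what such a class could look like, not code from the actual project: the `TableGateway` body, the `forTeam` and `gatewayCount` method names, and the use of `ConcurrentHashMap` instead of a plain hashmap are all hypothetical choices.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-team gateway; a real one would wrap the DBUtils queries.
final class TableGateway {
    private final String teamId;

    TableGateway(String teamId) {
        this.teamId = teamId;
    }

    String teamId() {
        return teamId;
    }
}

// Registry with a static lookup method, as described in Q2.
final class TableGatewayFinder {
    private static final ConcurrentMap<String, TableGateway> GATEWAYS =
            new ConcurrentHashMap<>();

    private TableGatewayFinder() {
    }

    // computeIfAbsent is atomic per key, so concurrent requests from the
    // same team end up sharing a single gateway instance.
    static TableGateway forTeam(String teamId) {
        return GATEWAYS.computeIfAbsent(teamId, TableGateway::new);
    }

    static int gatewayCount() {
        return GATEWAYS.size();
    }
}
```

With this shape, 18 users spread over T1 to T3 would share three gateway instances. The registry lookup is safe for concurrent callers, but the gateway's own methods would still need to be made thread-safe separately.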
How would one unit test the gateway (TDD based or after the fact)?
Thanks in advance!
Comments (3)
If you just write to the DB, or to a cache solution on top of the DB (e.g. Spring+Hibernate+EhCache), you don't need to worry about corrupting your tables; i.e., there is no concurrency issue from a low-level point of view.
If you want to write a cache yourself and deal with concurrency issues yourself, that will involve some effort. If you shard your cache, have a "global lock" per partition (i.e. synchronized on a common mutex), and acquire this lock for every access, then that would work, although it's not the most performant way to do it. Doing anything other than a global lock would involve quite a lot of work.
While this is a minor point, I'm not sure why you'd want to use an identity hash map. I can't think of any particular reason to do that (if you are thinking about performance, the performance of a normal hash map would be the last thing you need to worry about in this situation!).
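The lock-per-partition approach mentioned above could look roughly like the following. This is a minimal sketch under assumptions: `ShardedWallCache`, `TeamPartition` and their methods are hypothetical names, posts are reduced to an id-to-text map, and one partition per team is assumed because teams access disjoint data.

```java
import java.util.HashMap;
import java.util.Map;

// One partition per team: a plain HashMap guarded by a single mutex.
final class TeamPartition {
    private final Object lock = new Object();
    private final Map<Long, String> posts = new HashMap<>();

    void put(long postId, String text) {
        synchronized (lock) {
            posts.put(postId, text);
        }
    }

    // Returns null when the post is not cached.
    String get(long postId) {
        synchronized (lock) {
            return posts.get(postId);
        }
    }

    int size() {
        synchronized (lock) {
            return posts.size();
        }
    }
}

final class ShardedWallCache {
    private final Map<String, TeamPartition> partitions = new HashMap<>();

    // The cache-wide monitor is held only while looking up or creating a
    // partition; reads and writes then contend only within one team.
    synchronized TeamPartition partitionFor(String teamId) {
        return partitions.computeIfAbsent(teamId, id -> new TeamPartition());
    }
}
```

Every access takes the partition's lock, which is simple to reason about but serializes all members of a team. That is the performance trade-off noted above.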
If your entities are articles, then you probably have another form of concurrency issue, like the one solved by version control software such as SVN or Mercurial. That is, if you don't build merging capability into your app, it becomes an annoyance when somebody edits an article only to find that somebody else has "committed" another edit first. Whether you need to add such a capability depends on the use case.
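Short of full merging, one common way to at least detect such a conflicting edit is an optimistic version check, similar in spirit to the timestamp comparison mentioned in the question. The sketch below is an in-memory stand-in, not a DB implementation: `VersionedPostStore` and its methods are hypothetical, and the `version` column is an assumed addition to the schema. Against MySQL the same idea would typically be an `UPDATE ... WHERE id = ? AND version = ?` followed by a check of the affected-row count.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a posts table that carries a version column.
final class VersionedPostStore {
    private static final class Row {
        String text;
        int version = 1;

        Row(String text) {
            this.text = text;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String text) {
        rows.put(id, new Row(text));
    }

    synchronized int versionOf(long id) {
        return rows.get(id).version;
    }

    synchronized String textOf(long id) {
        return rows.get(id).text;
    }

    // Mirrors: UPDATE post SET text = ?, version = version + 1
    //          WHERE id = ? AND version = ?
    // Returns false when another edit was committed first, i.e. when the
    // SQL statement would have affected zero rows.
    synchronized boolean update(long id, int expectedVersion, String newText) {
        Row row = rows.get(id);
        if (row == null || row.version != expectedVersion) {
            return false;
        }
        row.text = newText;
        row.version++;
        return true;
    }
}
```

If two team members load the same post at version 1, the first `update` succeeds and the second returns `false`. The app can then show the newer text, or store the second edit as a "conflict-flagged" append.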
As for testing your app for concurrency, unit testing is not bad. Writing concurrent unit tests makes it much easier to catch concurrency bugs. Writing concurrent tests is very tough, though, so I recommend going through a good book like "Java Concurrency in Practice" before writing them. It's better to catch your concurrency bugs before integration testing, when it becomes hard to guess what the hell is going on!
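As a rough idea of what a concurrent unit test can look like, the sketch below releases several threads at once with a `CountDownLatch` and then checks an invariant. `PostCounter` and `ConcurrentHarness` are hypothetical names made up for this example; in practice the task would call the gateway or cache under test, and the assertion would sit in a JUnit test method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class under test: counts wall posts for one team.
final class PostCounter {
    private final AtomicInteger count = new AtomicInteger();

    void recordPost() {
        count.incrementAndGet();
    }

    int total() {
        return count.get();
    }
}

final class ConcurrentHarness {
    // Runs the task `iterations` times on each of `threads` threads, all
    // released together by a start latch to maximise contention.
    static void runConcurrently(int threads, int iterations, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await();
                    for (int i = 0; i < iterations; i++) {
                        task.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        start.countDown();
        try {
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would then call `ConcurrentHarness.runConcurrently(6, 1000, counter::recordPost)` and assert that `counter.total()` equals 6000. A passing run does not prove the code is thread-safe, since races may simply not show up, but a failing run is a reliable sign of a bug.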
UPDATE:
@Nupul: That's a difficult question to answer. However, if you just have 18 humans typing stuff, my bet is that writing to the DB every time would be just fine.
If you don't store any state elsewhere (i.e. only in the DB), you should get rid of any unnecessary mutexes (and IMO you should not store state anywhere other than the DB unless you have a very good reason to do so in your situation).
It's easy to make a mistake and hold a mutex while doing something like a network operation, and so cause extreme usability issues (e.g. the app does not respond for many seconds). It's also easy to end up with nasty concurrency bugs such as thread deadlocks.
So my recommendation would be to keep your app stateless and just write to the DB every time. Should you find performance issues due to DB access, turning to a cache solution like EhCache would be the best bet.
Unless you want to learn from the project or have to deliver an app with extreme performance requirements, I don't think writing your own cache layer is justified.
Unit testing might not be the best approach for concurrency issues. Instead, you could try a web-based performance tool such as JMeter or Rational Performance Tester to test how it performs and that you've got valid wall contents as you ramp up the number of users. You can give each user different posting behaviour with these tools.
Focusing on the 2 questions in the title.
Are there concurrency issues? Yes, obviously. The only way to avoid the possibility of concurrency problems would be to make the product single-threaded, and that would cause major performance and usability issues.
How do you test for them? That's a hard question.
Obviously you need unit tests. But classic unit tests don't really cut it for concurrency issues, because tricky concurrency bugs tend to show up rarely.
A better approach is load testing, as described by @Gnat.
But for best results, you need to acknowledge that testing is not the (whole) solution, and add the following: