Keeping distributed databases synchronized over an unreliable network
I'm facing the following challenge:
I have a bunch of databases in different geographical locations where the network may fail a lot (I'm using a cellular network). I need to keep all the databases synchronized, but it does not need to happen in real time. I'm using Java, but I am free to choose any free database.
How can I achieve this?
Comments (4)
This problem has a well-established body of research, of which people are apparently unaware. I suggest not reinventing a poor, defective wheel unless it is absolutely necessary (for example, when your requirements are unusual enough to allow a trivial solution).
Some keywords: replication, mobile DBMSs, distributed disconnected DBMSs.
These research papers are also relevant (as examples of this research field):
... and so on.
I am not aware of any database that gives you this functionality out of the box. There is a lot of complexity here because of the need for eventual consistency and conflict resolution. For example, what happens if the network splits into two halves, you update a value to 123 while I update the same value to 321 on the other half, and then the networks reconnect?
You may have to roll your own.
For some ideas on how to do this, check out the design of Yahoo's PNUTS system: http://research.yahoo.com/node/2304 and Amazon's Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
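To make the 123-versus-321 scenario concrete, here is a minimal sketch of a version vector, the kind of per-site counter map that the Dynamo design linked above uses (under the name vector clocks) to tell an ordinary newer write apart from two writes made independently during a partition. The class name `VersionVector`, its methods, and the site IDs are hypothetical names chosen for illustration; they are not taken from PNUTS, Dynamo, or any library.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration class; not part of any real library. */
final class VersionVector {
    enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    // One counter per site (database location) that has written the record.
    private final Map<String, Long> counters = new HashMap<>();

    /** Records one local update made at the given site. */
    void increment(String siteId) {
        counters.merge(siteId, 1L, Long::sum);
    }

    /** Returns an independent copy, as another site would hold after a sync. */
    VersionVector copy() {
        VersionVector v = new VersionVector();
        v.counters.putAll(counters);
        return v;
    }

    /** Compares this vector with another; CONCURRENT means a real conflict. */
    Order compare(VersionVector other) {
        boolean less = false;
        boolean greater = false;
        Set<String> sites = new HashSet<>(counters.keySet());
        sites.addAll(other.counters.keySet());
        for (String site : sites) {
            long mine = counters.getOrDefault(site, 0L);
            long theirs = other.counters.getOrDefault(site, 0L);
            if (mine < theirs) less = true;
            if (mine > theirs) greater = true;
        }
        if (less && greater) return Order.CONCURRENT;
        if (less) return Order.BEFORE;
        if (greater) return Order.AFTER;
        return Order.EQUAL;
    }

    /** Takes the element-wise maximum, to be called once a conflict is resolved. */
    void mergeFrom(VersionVector other) {
        other.counters.forEach((site, n) -> counters.merge(site, n, Long::max));
    }
}
```

If two sites each copy the same record and update it while disconnected, `compare` returns `CONCURRENT` when they reconnect. The sketch only detects the conflict; deciding whether 123 or 321 wins (last writer wins, a merge rule, or asking a user) remains an application decision.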
Check out SymmetricDS. SymmetricDS is web-enabled, database-independent data synchronization/replication software. It uses web and database technologies to replicate tables between relational databases in near real time. The software was designed to scale to a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.
I don't know your requirements or your apps, but this isn't a quick-answer type of question. I'm very interested to see what others have to say. However, I have a suggestion that may or may not work for you, depending on your requirements and situation. In particular, it will not help if your users need to use the app even when the network is unavailable (offline access).
Keeping a bunch of small databases synchronized is a fairly complex task to do correctly. Is there any possibility of having just one centralized database, and either having the client applications connect directly to it or (my preferred solution) writing some web services to handle accessing and updating data, rather than having a bunch of client databases?
I realize this limits offline access, but there are various caching strategies you can use. (Which, of course, leads you back to your original question.)
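One caching strategy that fits the centralized approach is a client-side outbox: changes are queued locally while the network is down and replayed in order once the web service is reachable again. The following is only a sketch under assumptions. The `Outbox` class and its methods are hypothetical names, the queue lives in memory (a real client would keep it in a local table or file so it survives restarts), and the `sender` callback stands in for whatever web-service call you write.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical store-and-forward queue; in-memory only for illustration. */
final class Outbox<T> {
    private final Deque<T> pending = new ArrayDeque<>();

    /** Queues a change made locally, whether or not the network is up. */
    void enqueue(T change) {
        pending.addLast(change);
    }

    /** Number of changes still waiting to be delivered. */
    int size() {
        return pending.size();
    }

    /**
     * Tries to deliver queued changes in their original order.
     * The sender returns true once the server has acknowledged the change.
     * Delivery stops at the first failure so ordering is kept and nothing is dropped.
     *
     * @return how many changes were delivered during this call
     */
    int flush(Predicate<T> sender) {
        int sent = 0;
        while (!pending.isEmpty()) {
            T next = pending.peekFirst();
            boolean acknowledged;
            try {
                acknowledged = sender.test(next);
            } catch (RuntimeException e) {
                acknowledged = false; // treat any error as a network failure
            }
            if (!acknowledged) {
                break; // keep the change queued and retry on the next flush
            }
            pending.removeFirst();
            sent++;
        }
        return sent;
    }
}
```

A client would call `flush` on a timer or whenever connectivity returns. Because a change may reach the server just before the connection drops and the acknowledgement is lost, the same change can be sent twice, so the receiving web service should be written to tolerate duplicates.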