Pros and cons of document-based databases vs. relational databases

Posted on 2024-07-09 21:35:32

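I've been trying to see whether I can meet some requirements with a document-based database, in this case CouchDB. Two generic requirements:

CRUD of entities with some fields that have a unique index on them
An e-commerce web app like eBay (there is a better description here).

I'm beginning to think that a document-based database isn't the best choice for these requirements. Furthermore, I can't imagine a use for a document-based database (maybe my imagination is too limited).

Can you explain to me whether I'm asking for pears from an elm tree when I try to use a document-oriented database for these requirements?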


你丑哭了我 2024-07-16 21:35:32

You need to think of how you approach the application in a document oriented way. If you simply try to replicate how you would model the problem in an RDBMS then you will fail. There are also different trade-offs that you might want to make. ([ed: not sure how this ties into the argument but:] Remember that CouchDB's design assumes you will have an active cluster of many nodes that could fail at any time. How is your app going to handle one of the database nodes disappearing from under it?)

One way to think about it is to imagine you didn't have any computers, just paper documents. How would you create an efficient business process using bits of paper being passed around? How can you avoid bottlenecks? What if something goes wrong?

Another angle you should think about is eventual consistency, where you will get into a consistent state eventually, but you may be inconsistent for some period of time. This is anathema in RDBMS land, but extremely common in the real world. The canonical transaction example is transferring money between bank accounts. How does this actually happen in the real world: through a single atomic transaction, or through different banks issuing credit and debit notices to each other? What happens when you write a cheque?
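To make the credit and debit notice idea concrete, here is a minimal sketch in plain Python. It does not talk to CouchDB; the `merge` and `balances` helpers, the field names, and the sample documents are assumptions made up for illustration. Each transfer is stored as its own immutable document, and balances are derived by summing over whatever documents a node has seen so far, much as a map/reduce view might.

```python
from collections import defaultdict


def merge(*replicas):
    """Union of documents keyed by _id, standing in for replication (hypothetical helper)."""
    merged = {}
    for replica in replicas:
        for doc in replica:
            merged[doc["_id"]] = doc
    return list(merged.values())


def balances(transfer_docs):
    """Derive per-account balances by summing transfer documents."""
    totals = defaultdict(int)
    for doc in transfer_docs:
        totals[doc["from"]] -= doc["amount"]  # debit notice
        totals[doc["to"]] += doc["amount"]    # credit notice
    return dict(totals)


# Two nodes accept transfers independently; amounts are in cents.
replica_a = [{"_id": "t1", "from": "alice", "to": "bob", "amount": 500}]
replica_b = [{"_id": "t2", "from": "bob", "to": "carol", "amount": 200}]

print(balances(replica_a))                    # node A's view before replication
print(balances(merge(replica_a, replica_b)))  # view once both nodes have exchanged documents
```

Before replication each node reports a different, incomplete picture; after the documents are exchanged, every node computes the same balances. No document is ever updated in place, so there is nothing to lock.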

So let's look at your examples:

CRUD of entities with some fields with unique index on it.

If I understand this correctly in CouchDB terms, you want to have a collection of documents where some named value is guaranteed to be unique across all those documents? That case isn't generally supportable because documents may be created on different replicas.

So we need to look at the real world problem and see if we can model that. Do you really need them to be unique? Can your application handle multiple docs with the same value? Do you need to assign a unique identifier? Can you do that deterministically? A common scenario where this is required is where you need a unique sequential identifier. This is tough to solve in a replicated environment. In fact if the unique id is required to be strictly sequential with respect to time created it's impossible if you need the id straight away. You need to relax at least one of those constraints.
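One way to assign an identifier deterministically is to derive the document `_id` from the value that must be unique, since CouchDB does enforce unique `_id`s within a single database. The sketch below is a toy in-memory model, not a CouchDB client: the `Replica` class, `user_doc`, `find_clashes`, and the `user:` prefix are all hypothetical names chosen for illustration. It shows that a single node can reject a duplicate at once, while two nodes that each accept the same value only find out when they compare notes.

```python
import hashlib


class Conflict(Exception):
    """Raised when a document with the same _id already exists on this node."""


class Replica:
    """Toy in-memory stand-in for one database node (not a real CouchDB API)."""

    def __init__(self):
        self.docs = {}

    def create(self, doc):
        if doc["_id"] in self.docs:
            raise Conflict(doc["_id"])
        self.docs[doc["_id"]] = doc


def user_doc(email):
    """Build a document whose _id is derived deterministically from the unique field."""
    normalized = email.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return {"_id": "user:" + digest, "email": normalized}


def find_clashes(a, b):
    """IDs that were created independently on both nodes."""
    return sorted(set(a.docs) & set(b.docs))


node_a, node_b = Replica(), Replica()
node_a.create(user_doc("Ann@example.com"))
node_b.create(user_doc("ann@example.com"))  # accepted: node B has not seen node A's document
print(find_clashes(node_a, node_b))         # the duplicate only surfaces when the nodes compare
```

So uniqueness holds immediately on one node, but across replicas it becomes a conflict that the application has to resolve after the fact, which is the constraint being relaxed here.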

ecommerce web app like ebay

I'm not sure what to add here as the last comment you made on that post was to say "very useful! thanks". Was there something missing from the approach outlined there that is still causing you a problem? I thought MrKurt's answer was pretty full and I added a little enhancement that would reduce contention.


花伊自在美 2024-07-16 21:35:32

Do you need to normalize your data?

Yes: use relational.
No: use document.


旧情勿念 2024-07-16 21:35:32

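I'm in the same boat. I'm loving CouchDB at the moment, and I think the whole functional style is great. But when exactly do we start using these databases in earnest for applications? Sure, we can all start developing applications extremely quickly, free of all those nasty hang-ups about normal forms, and without using schemas. But, to borrow a phrase, "we are standing on the shoulders of giants". There are good reasons to use an RDBMS, to normalize, and to use schemas. My old Oracle head is reeling at the thought of data without form.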

My main wow factor with CouchDB is the replication and the versioning system working in tandem.

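I've been racking my brain for the last month trying to grok CouchDB's storage mechanism. Apparently it uses B-trees but doesn't store data based on normal forms. Does this mean it is really, really smart and realizes when bits of data are duplicated, so it just creates a pointer to the existing B-tree entry?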

So far I'm thinking of XML documents, config files, and resource files streamed to base64 strings.
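As a rough illustration of that idea, the sketch below wraps a file's bytes in a JSON-serializable document using the shape CouchDB accepts for inline attachments (`_attachments`, with `content_type` and base64 `data`). The helper names, the document ID, and the sample XML are assumptions for the example, and nothing here is sent to a server.

```python
import base64
import json


def file_to_doc(doc_id, filename, content, content_type):
    """Wrap raw bytes as a document with one inline, base64-encoded attachment."""
    encoded = base64.b64encode(content).decode("ascii")
    return {
        "_id": doc_id,
        "_attachments": {
            filename: {"content_type": content_type, "data": encoded},
        },
    }


def attachment_bytes(doc, filename):
    """Recover the original bytes from an inline attachment."""
    return base64.b64decode(doc["_attachments"][filename]["data"])


config_xml = b"<config debug='true'/>"  # hypothetical config file content
doc = file_to_doc("config:app", "app.xml", config_xml, "application/xml")

payload = json.dumps(doc)  # what would be sent as the request body
print(attachment_bytes(json.loads(payload), "app.xml") == config_xml)
```

Base64 makes the stored form about a third larger than the raw bytes, so this suits small files better than large ones.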

But would I use CouchDB to store structured data? I don't know. Any help would be greatly appreciated.

It might be useful for storing RDF data or even free-form text.
