Key-value, hierarchical, map-reduce, or graph database systems are much closer to implementation strategies: they are heavily tied to the physical representation. The primary reason to choose one of these is that there is a compelling performance argument and it fits your data processing strategy very closely. Beware: ad-hoc queries are usually not practical for these systems, so you're better off deciding on your queries ahead of time.
Relational database systems try to separate the logical, business-oriented model from the underlying physical representation and processing strategies. This separation is imperfect, but still quite good. Relational systems are great for handling facts and extracting reliable information from collections of facts. Relational systems are also great at ad-hoc queries, which the other systems are notoriously bad at. That's a great fit in the business world and many other places. That's why relational systems are so prevalent.
If it's a business application, a relational system is almost always the answer. For other systems, it's probably the answer. If you have more of a data processing problem, such as a pipeline of steps over massive amounts of data, and you know all of your queries up front, another system may be right for you.
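To make the ad-hoc query point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the helper names are made up for illustration. The same facts can be grouped by a different column without deciding that ahead of time.

```python
import sqlite3


def build_orders_db():
    """Create an in-memory database with a small, hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
        [
            ("alice", "east", 30.0),
            ("bob", "west", 20.0),
            ("alice", "east", 50.0),
            ("carol", "west", 10.0),
        ],
    )
    return conn


def total_by(conn, column):
    """Sum order amounts grouped by a column chosen at query time."""
    # Identifiers cannot be bound as SQL parameters, so only known names pass.
    if column not in {"customer", "region"}:
        raise ValueError(f"unsupported column: {column}")
    rows = conn.execute(
        f"SELECT {column}, SUM(amount) FROM orders "
        f"GROUP BY {column} ORDER BY {column}"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_orders_db()
    print(total_by(conn, "customer"))
    print(total_by(conn, "region"))
```

In a store that only supports lookup by key, each of these questions would typically need its own precomputed structure or a full scan.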
If your data is simply a list of things and you can derive a unique identifier for each item, then a KVS is a good match. They are close implementations of the simple data structures we learned in freshman computer science and do not allow for complex relationships.
A simple test: can you represent your data and all of its relationships as a linked list or hash table? If yes, a KVS may work. If no, you need an RDB.
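As a rough illustration of that test, the sketch below models a KVS as a thin wrapper around a Python dict. The class name TinyKVS and the session data are hypothetical. If everything you need to do with your data fits these three operations, it passes the hash-table test.

```python
class TinyKVS:
    """A toy in-memory key-value store: a hash table with three operations."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store or overwrite the value under a unique key.
        self._data[key] = value

    def get(self, key, default=None):
        # Fetch by key; there is no way to ask about relationships.
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the key if present; return whether anything was removed.
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    sessions = TinyKVS()
    sessions.put("session:42", {"user": "ann", "cart": ["book"]})
    print(sessions.get("session:42"))
    print(sessions.delete("session:42"))
```

A question such as "which sessions contain a given item?" has no matching operation here, which is the sign that the data has relationships a KVS will not express for you.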
You still need to find a KVS that will work in your environment. Support for KVSes, even the major ones, is nowhere near what it is for, say, PostgreSQL and MySQL/MariaDB.
IMO, key-value pair stores (e.g. NoSQL databases) work best when the underlying data is unstructured, unpredictable, or changing often. If you don't have structured data, a relational database is going to be more trouble than it's worth because you will need to make lots of schema changes and/or jump through hoops to conform your data to the structure.
KVP / JSON / NoSQL is great because changes to the data structure do not require completely refactoring the data model. Adding a field to your data object is simply a matter of adding it to the data. The other side of the coin is that there are fewer constraints and validation checks in a KVP / NoSQL database than in a relational database, so your data might get messy.
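Both sides of that coin can be sketched in a few lines of Python, with a plain dict of JSON strings standing in for a document store. The keys, field names, and helper functions are invented for the example.

```python
import json

# A dict of JSON strings stands in for a schemaless document store.
store = {}


def save(key, doc):
    store[key] = json.dumps(doc)


def load(key):
    return json.loads(store[key])


def missing_field(field):
    """Return the keys of documents that lack the given field."""
    return sorted(key for key in store if field not in load(key))


save("user:1", {"name": "Ann"})
# Adding a field needs no migration: just include it in the new document.
save("user:2", {"name": "Bob", "nickname": "bobby"})
# Nothing rejects a misspelled field either, which is how data gets messy.
save("user:3", {"nmae": "Cy"})

if __name__ == "__main__":
    print(missing_field("name"))
```

In a relational table with a NOT NULL name column, the third write would be rejected at insert time instead of being discovered later by a scan like missing_field.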
There are performance and space saving benefits for relational data models. Normalized relational data can make understanding and validating the data easier because there are table key relationships and constraints to help you out.
One of the worst patterns I've seen is trying to have it both ways. Trying to put a key-value pair into a relational database is often a recipe for disaster. I would recommend using the technology that suits your data foremost.
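One common form of this pattern is a generic entity / attribute / value table. That is my reading of the remark, not something the answer spells out. The sketch below uses sqlite3 with a hypothetical eav table to show what it costs: getting ordinary rows back takes a pivot query, and every value loses its type.

```python
import sqlite3


def build_eav_db():
    """Store records as (entity, attribute, value) triples in one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eav (entity INTEGER, attr TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [
            (1, "name", "Ann"),
            (1, "age", 30),  # stored as text because the column is TEXT
            (2, "name", "Bob"),  # no age row: nothing enforces completeness
        ],
    )
    return conn


def rebuild_rows(conn):
    """Pivot the triples back into (entity, name, age) rows."""
    return conn.execute(
        "SELECT entity, "
        "MAX(CASE WHEN attr = 'name' THEN value END), "
        "MAX(CASE WHEN attr = 'age' THEN value END) "
        "FROM eav GROUP BY entity ORDER BY entity"
    ).fetchall()


if __name__ == "__main__":
    print(rebuild_rows(build_eav_db()))
```

The pivot query grows by one CASE expression per attribute, and the database can no longer check that an age is a number or that every entity has a name.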
If you want O(1) lookups of values based on keys, then you want a KV store. Meaning, if you have data of the form k1={foo}, k2={bar}, etc., even when the values are larger or nested structures, and you want fast lookups, you want a KV store. Even with proper indexing, you generally cannot achieve O(1) lookups in a relational DB for arbitrary keys; the usual B-tree index gives O(log n). Sometimes this is referred to as "random lookups".
Alternatively stated, if you only ever query by one column, a "primary key" if you will, to retrieve the rest of the data, then using that column as a keyspace and the rest of the data as a value in a KV store is the most efficient way to do lookups.
In contrast, if you often query the data by any of several columns, that is, you support a richer query API for the data, then you may want a relational database.
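The difference between the two access patterns can be sketched with a Python dict standing in for the KV store. The user records and function names are hypothetical. Lookup by the key is a single hash probe, while a query on any other field has to scan every value.

```python
# A dict stands in for the KV store; keys play the role of the primary key.
users = {
    "u1": {"name": "Ann", "city": "Oslo"},
    "u2": {"name": "Bob", "city": "Lima"},
    "u3": {"name": "Cy", "city": "Oslo"},
}


def get_user(key):
    """Lookup by key: one hash probe, O(1) on average."""
    return users.get(key)


def find_by_city(city):
    """Query by a non-key field: a full O(n) scan over all values."""
    return sorted(key for key, value in users.items() if value["city"] == city)


if __name__ == "__main__":
    print(get_user("u2"))
    print(find_by_city("Oslo"))
```

If calls like find_by_city become common, you end up maintaining secondary indexes by hand, which is roughly the work a relational database does for you.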
A traditional relational database has problems scaling beyond a point. Where that point is depends a bit on what you are trying to do.
All (most?) of the suppliers of cloud computing are providing key-value data stores.
However, if you have a reasonably sized application with a complicated data structure, then the support that you get from using a relational database can reduce your development costs.
In my experience, if you're even asking the question whether to use traditional vs esoteric practices, then go traditional. While esoteric practices are sexy, challenging, and fun, 99.999% of applications call for a traditional approach.
With regards to relational vs KV, the question you should be asking is:
Since you have not described the scenario, it's impossible for anyone to tell you why you shouldn't use it. The "catch all" reason for KV is scalability, which isn't a problem now. Do you know the rules of optimization?
KV is a highly optimized solution to scalability that will most likely be completely unnecessary for your application.