Key/value based databases: can someone explain to me how to actually use them?
There seems to be a big push for key/value based databases, which I believe memcache to be.
Is the value usually some sort of collection or XML file that would hold more meaningful data?
If yes, is it generally faster to deserialize data than to do traditional JOINs and SELECTs on tables that return a row-based result set?
Comments (4)
What has happened is that some really, really, REALLY big web sites like Google and Amazon occupy a teeny, tiny niche where their data storage and retrieval requirements are so different from anyone else's that a new way of storing/retrieving data is called for. I'm sure these guys know what they are doing; they are very good at what they do.
However, this then gets picked up, reported on and distorted into "relational databases aren't up to handling data for the web". Also, readers start to think "hey, if relational databases aren't good enough for Amazon and Google, they aren't good enough for me."
These inferences are both wrong: 99.9% of all databases (including those behind web sites) are not in the same ballpark as Amazon and Google - not within several orders of magnitude. For this 99.9%, nothing has changed; relational databases still work just fine.
As with most things, "it depends". If the joins are relatively inconsequential (that is, a small number of joins on well-keyed data), and you are storing especially complex data, it may be better just to stick with the more complex query.
It's also a matter of freshness. In many cases the purpose of many joins is to bring together very disparate data; that is, data which varies widely in its relative freshness. It can add considerable complexity and overhead to keep a key-value pair table synchronized when a small slice of the data across a large number of pairs is updated. System complexity can often be considered a form of performance cost; the time, risk and cost of making a change to a complex system without impacting performance are often far greater than for a simple one.
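To make the synchronization cost concrete, here is a minimal sketch, assuming a plain Python dict stands in for the key-value store and each value is a JSON document that embeds a copy of the author's name. The key names, fields and the `rename_author` helper are all made up for illustration. Changing one small fact means scanning and rewriting every pair that carries a copy of it:

```python
import json

# Hypothetical key-value store: a dict mapping string keys to JSON strings.
# Each post embeds (denormalizes) the author's name.
kv = {
    "post:1": json.dumps({"title": "Hello", "author_id": 7, "author_name": "alice"}),
    "post:2": json.dumps({"title": "Joins", "author_id": 7, "author_name": "alice"}),
    "post:3": json.dumps({"title": "Other", "author_id": 9, "author_name": "bob"}),
}

def rename_author(store, author_id, new_name):
    """Rewrite every value that embeds this author's name; return how many were touched."""
    touched = 0
    for key, raw in list(store.items()):
        record = json.loads(raw)
        if record.get("author_id") == author_id:
            record["author_name"] = new_name
            store[key] = json.dumps(record)
            touched += 1
    return touched

updated = rename_author(kv, 7, "alice_smith")
print(updated)  # number of pairs that had to be rewritten for one rename
```

With a real store you would also have to decide what happens if the process dies halfway through the loop, which is part of the complexity cost described above.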
The best solution is always to code what works as simply as you can. In most cases I'd say this means create a fully normalized database design and join the crap out of it. Only revisit your design after performance becomes an obvious problem. When you analyze the issue, it will also be obvious where the problems lie and what needs to be done to fix them. If it's reducing joins, then so be it. You'll know when you need to know.
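As a rough illustration of "fully normalized and join it", here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `authors` and `posts` tables are invented for the example. Each fact is stored once, so a rename is a single-row update and the join picks it up everywhere:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors (id, name) VALUES (7, 'alice'), (9, 'bob');
    INSERT INTO posts (id, author_id, title) VALUES
        (1, 7, 'Hello'), (2, 7, 'Joins'), (3, 9, 'Other');
""")

# The author's name lives in exactly one row, so a rename touches one row.
conn.execute("UPDATE authors SET name = ? WHERE id = ?", ("alice_smith", 7))

# The join assembles the current view of the data at read time.
rows = conn.execute("""
    SELECT posts.title, authors.name
    FROM posts JOIN authors ON authors.id = posts.author_id
    ORDER BY posts.id
""").fetchall()
print(rows)
```

Whether the join is actually a bottleneck is something to measure on your own data, not something this toy example can tell you.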
I don't have a lot of experience with key/value dbs, so take what I say with a grain of salt.
With that said, the first thing I should point out is that memcached isn't a key/value database. A database implies some kind of persistent store, which memcached isn't. Memcached is intended to be a temporary store that saves you a query to the actual database.
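The usual way to use a cache like this is often called cache-aside: look in the cache first, and only query the database on a miss. Below is a minimal sketch of that idea. It does not use a real memcached client; a plain dict stands in for the cache, `sqlite3` stands in for "the actual database", and the `users` table, key format and `get_user_name` function are assumptions made up for the example:

```python
import sqlite3

# The "actual database": persistent in real life, in-memory here for the sketch.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")

cache = {}                  # stand-in for memcached; entries may vanish at any time
stats = {"db_queries": 0}   # counts how often we had to hit the database

def get_user_name(user_id):
    key = f"user:{user_id}:name"
    if key in cache:                      # cache hit: no database query needed
        return cache[key]
    stats["db_queries"] += 1              # cache miss: fall back to the database
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[key] = name                 # remember the answer for next time
    return name

first = get_user_name(1)    # miss, queries the database
second = get_user_name(1)   # hit, served from the cache
cache.clear()               # simulate the cache being flushed or restarted
third = get_user_name(1)    # miss again, but the data is still in the database
print(first, second, third, stats["db_queries"])
```

Losing the cache costs an extra query but no data, which is exactly why it isn't a database.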
Other than that, my understanding is that you're not going to be able to replace your RDBMS with a key/value database. They tend to be best for unstructured data or other data where you may not know all the attributes that need to be stored. If you need to store highly-structured data, you can't do much better than a traditional RDBMS.
They can be complex structured data that needs deserialization. They can also be simple fixed-size records, just like your RDBMS. Part of the benefit is that you get to make that decision yourself. When you're optimizing your database, you're not limited to what SQL can do.
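A quick sketch of the two extremes, using only Python's standard library. The record contents and the byte layout are made up for illustration; a real store would just see both values as opaque bytes:

```python
import json
import struct

# Option 1: a structured value that must be deserialized on every read.
record = {"id": 42, "score": 3.5, "tags": ["a", "b"]}
json_value = json.dumps(record).encode("utf-8")     # bytes you would store under a key
decoded = json.loads(json_value.decode("utf-8"))    # what you get back after a read

# Option 2: a fixed-size record, much like a row with fixed-width columns.
# Assumed layout: little-endian unsigned 32-bit id followed by a 64-bit float score.
LAYOUT = struct.Struct("<Id")
fixed_value = LAYOUT.pack(42, 3.5)                  # always LAYOUT.size bytes long
fixed_id, fixed_score = LAYOUT.unpack(fixed_value)

print(len(json_value), LAYOUT.size)
```

The JSON form can grow new fields freely; the fixed layout is compact and cheap to decode but rigid. Which one is right depends on your data, and the store doesn't care.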
The way you ask makes it sound like the join or the deserialization will always be the bottleneck. But in any database, things are never that simple. You can put denormalized data in your RDBMS, too, or write an RDBMS interface on top of a key-value database, if you really want.
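For instance, nothing stops you from treating a relational table as a key-value store. Here is a minimal sketch with `sqlite3`, assuming a hypothetical two-column `kv` table and made-up `put`/`get` helpers that store whole JSON documents as values:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")

def put(key, obj):
    """Store a whole denormalized document under one key, replacing any old value."""
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
        (key, json.dumps(obj)),
    )

def get(key):
    """Fetch and deserialize the document, or return None if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return json.loads(row[0]) if row else None

put("post:1", {"title": "Hello", "author": {"id": 7, "name": "alice"}})
put("post:1", {"title": "Hello again", "author": {"id": 7, "name": "alice"}})
post = get("post:1")
print(post["title"])
```

You give up joins on the contents of `value`, but you keep transactions and the rest of what the RDBMS offers.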