分片和标准化是相互排斥的吗?
我遇到了一个评论,这让我想知道:如果你使用数据库可扩展性的分片方法,这是否意味着数据库是非规范化的?你能有一个规范化的分片数据库吗?
I ran across a comment that made me wonder: If you use a sharding approach to db scalability, does that mean the database is denormalized? Can you have a normalized, sharded database?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
它们并不相互排斥。在扩展大规模数据集时,两者经常被使用,但其中一个与另一个并没有太大关系。您绝对可以拥有分片的、规范化的数据库……或非规范化的、非分片的数据库。
在分片中,您只需采用给定的模式(标准化或非标准化)并将其分布在多个物理/逻辑数据存储中。例如,这允许您让具有特定特征(例如“AD”中的姓氏)的所有用户居住在给定的数据库实例上。请注意,如何对数据库进行分片是一个至关重要的决定,并且高度依赖于实现。
另一方面,非规范化可以在有或没有分片数据库的情况下完成,旨在通过减少回答特定问题所需的连接/子查询来简化查询。当然,那么您通常会以编程方式维护数据完整性。
一些不错的读物:
分片理论与分片实践
一些高度可扩展的数据库实现“在野外”
The are not mutually exclusive. Both are often used when scaling massive datasets, but one doesn't really have much to do with the other. You can absolutely have a sharded, normalized database...or a denormalized, nonsharded database.
In sharding, you're just taking a given schema (normalized or not) and distributing it across a number of physical/logical data stores. This allows, for example, you to have all your users with a particular characteristic (e.g., last name in 'A-D') to live on a given database instance. Note that HOW you shard your database is a crucial decision and will be highly implementation dependent.
Denormalization, on the other hand, can be done with or without a sharded database and is intended to simply queries by reducing the joins/subqueries needed to answer a particular question. Of course, then you would typically programmatically maintain data integrity.
Some good reading:
Sharding theory & practice
Some highly-scalable database implementations 'in the wild'