Database sharding strategy

Posted on 2024-08-13 08:43:25

For an online marketplace product that is under construction, I need to implement a database sharding solution. I am new to sharding, and after reading the posts in this forum I feel that a directory-based sharding strategy keyed on business entities would be suitable. However, I am still not clear about the denormalization and data synchronization best practices to adopt with such a sharded solution.
There will be three core entities: supplier, customer, and order. I am planning to shard the database by supplier ID, since most of the processing on order data will be carried out by supplier admins. This ensures that the orders for a supplier are fetched from a single DB instance, eliminating cross-database fetches. However, when customers view their order information, that data will reside in multiple DB instances and will require fetching from multiple databases. What is typically done when such scenarios come up in a sharded solution?
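
For concreteness, the directory-based scheme described above might look roughly like the following. This is only a minimal sketch under stated assumptions: the `ShardDirectory` class, the shard names `orders_db_0` and `orders_db_1`, and the least-loaded placement rule are all hypothetical, and a real directory would live in a small, highly available lookup database rather than in process memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ShardDirectory:
    """Maps each supplier id to the shard that owns that supplier's orders."""

    shards: list[str]
    assignments: dict[int, str] = field(default_factory=dict)

    def assign(self, supplier_id: int) -> str:
        """Return the supplier's shard, placing new suppliers on the least-loaded shard."""
        if supplier_id not in self.assignments:
            # Count how many suppliers each shard already holds.
            load = {shard: 0 for shard in self.shards}
            for shard in self.assignments.values():
                load[shard] += 1
            # min() returns the first shard with the lowest count.
            self.assignments[supplier_id] = min(self.shards, key=lambda s: load[s])
        return self.assignments[supplier_id]

    def shard_for(self, supplier_id: int) -> str:
        """Look up an existing supplier; raises KeyError if the supplier is unknown."""
        return self.assignments[supplier_id]


# Hypothetical shard names and supplier ids, for illustration only.
directory = ShardDirectory(shards=["orders_db_0", "orders_db_1"])
for supplier_id in (101, 102, 103):
    directory.assign(supplier_id)
```

With this layout, every order query issued by a supplier admin resolves to exactly one shard via `shard_for`, while a query by customer has no such single home, which is the problem the question is asking about.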

Comments (4)

十六岁半 2024-08-20 08:43:25

I think there is a 99.9% chance that you do not need sharding.

You need sharding if:

  • Your database insert/update rate is close to, or exceeds, the capacity of the highest-spec server you can cost-effectively buy, AND
  • You are already farming out most of your read queries, reporting, backups, etc. to read-only replicated slaves, AND
  • You have done functional partitioning to move any nonessential or unrelated update-heavy workloads off your master server

If you cannot definitely say "yes" to all three of the above, you do not need to shard.

Read

http://www.mysqlperformanceblog.com/2009/08/06/why-you-dont-want-to-shard/

孤独陪着我 2024-08-20 08:43:25

Database sharding can be extremely effective, even before your database grows to multiple TBs in size. The main reason we have found is that sharding markedly changes the ratio of memory and CPU to disk: each shard holds less data, and DBMS products such as MySQL are very good at keeping the most recently used indexes and data in memory.

For your data sharding problem, this technique may help.

  • Parallel Query (we call it "Go Fish" queries). With this idea you can query your customer orders from multiple shards at the same time and consolidate the results. If it's done right, this can be very efficient.
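
The parallel query idea can be sketched as a scatter-gather over all shards. This is a minimal illustration, not a specific product's implementation: the `SHARDS` dictionary, the row layout, and the function names are hypothetical, and in-memory lists stand in for the per-shard databases that a real system would query over separate connections.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for per-supplier shards; a real
# implementation would run a SQL query against each database instead.
SHARDS = {
    "orders_db_0": [
        {"order_id": 1, "supplier_id": 101, "customer_id": 7, "created": "2024-01-05"},
        {"order_id": 2, "supplier_id": 101, "customer_id": 9, "created": "2024-01-06"},
    ],
    "orders_db_1": [
        {"order_id": 3, "supplier_id": 102, "customer_id": 7, "created": "2024-01-02"},
    ],
}


def fetch_from_shard(shard_name, customer_id):
    """Return one shard's orders for the customer (simulated query)."""
    return [row for row in SHARDS[shard_name] if row["customer_id"] == customer_id]


def orders_for_customer(customer_id):
    """Query every shard in parallel, then merge and sort the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        # Scatter: one task per shard. Gather: pool.map yields results in shard order.
        partials = pool.map(lambda name: fetch_from_shard(name, customer_id), SHARDS)
        merged = [row for partial in partials for row in partial]
    # Consolidate: impose a single ordering across all shards.
    return sorted(merged, key=lambda row: row["created"])


customer_orders = orders_for_customer(7)
```

Because the shards are queried concurrently, the latency of the customer view is roughly that of the slowest shard rather than the sum of all shards, at the cost of touching every shard for each customer request.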

For data that doesn't change much, we often recommend Global Table replication for common lookup tables, but that will not help much with something as active as customer orders.

In any case, sharding can be implemented in a very cost-effective manner. It can scale linearly for writes, and often better than linearly for reads, for the memory-related reason described above.

感情洁癖 2024-08-20 08:43:25

You may also want to try out NoSQL databases such as MongoDB or Cassandra.

You can also use memcache to cache data for fast access.

You can also look into master-slave replication with multiple slaves.

弃爱 2024-08-20 08:43:25

For relational databases, Apache ShardingSphere may help you shard data transparently.

It can shard data using built-in sharding algorithms as well as custom algorithms defined by developers.

Just use CREATE SHARDING TABLE RULE t_order ... to add the sharding rule; all other SQL stays the same as with the original database.

FYI: https://shardingsphere.apache.org/document/current/en/features/sharding/
