Best way to move a data row to another shard?
The question says it all.
Example: I'm planning to shard a database table. The table contains customer orders, each flagged as "active", "done", or "deleted". I have three shards, one for each flag.
As far as I understand, a row has to be moved to the right shard when its flag changes.
Am I right?
What's the best way to do this?
Can triggers be used?
I thought about not moving the row immediately, but only at the end of the day/week/month. But then it is not determined which shard a row with a specific flag resides in, and searches would always have to run over all shards.
EDIT: Some clarification:
In general, I have to choose a criterion that decides which shard a row resides in. In this case I want it to be the flag described above because, in my opinion, it's the most natural way to shard this kind of data. There is only a limited number of active orders, which are accessed very often. There is a large number of finished orders, which are seldom accessed, and a very large number of rows that are almost never accessed.
If I want to know where a specific data row resides, I don't have to search all shards. If the user wants to load an active order, I already know which database to look in.
Now the flag, which is my sharding criterion, changes, and I want to know the best way to deal with this case. If I just kept the record in its original database, eventually all data would accumulate in a single table.
Comments (2)
In my opinion, keeping all active records in a single shard may not be a good idea. With such a sharding strategy, all I/O will be performed on a single database instance, leaving all the others highly underutilized.
An alternative sharding strategy would be to distribute newly created rows among the shards using some kind of hash function. This will allow
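To illustrate the hash-based distribution suggested here, a minimal Python sketch follows. The shard names and the choice of the order id as the hash input are assumptions for illustration only; the answer does not say which key or hash function to use.

```python
import hashlib

# Hypothetical shard names; in practice these would map to connection settings.
SHARDS = ["shard_0", "shard_1", "shard_2"]


def shard_for(order_id: int, shards=SHARDS) -> str:
    """Pick a shard from a stable hash of the order id (assumed sharding key)."""
    # A cryptographic digest is used only because it is stable across processes;
    # Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(str(order_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Because the key in this sketch is the order id rather than the flag, a row stays on the same shard when its flag changes. The trade-off is that a query for all active orders has to visit every shard.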
Sharding usually refers to separating the data into different databases on different servers. Oracle can do what you want using a feature called partitioned tables.
If you use triggers (before/after insert/update), the move would be immediate. Other methods would leave different types of data in the first (active) shard until it is cleaned up.
I would also suggest doing this by date (for example, a monthly job that moves anything that's inactive and older than a month to another "Archive" database).
I'd ask you to reconsider if you're doing this to increase performance (unless you have terabytes of data in this table). Please tell us why you want to shard, and we'll all think about ways to solve your problem.
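A periodic job of that kind could be sketched roughly as follows, using Python's built-in sqlite3 module as a stand-in for the real databases. The `orders` schema, the `updated_at` column, and the function names are assumptions made for this example, not part of the original setup. The sketch also assumes nothing else modifies the selected rows while the job runs.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumed schema: updated_at holds an ISO 8601 timestamp string.
SCHEMA = """CREATE TABLE IF NOT EXISTS orders (
    id INTEGER PRIMARY KEY,
    flag TEXT NOT NULL,
    updated_at TEXT NOT NULL
)"""


def one_month_before(now: datetime) -> datetime:
    """Cutoff helper: treat 'a month' as 30 days for simplicity."""
    return now - timedelta(days=30)


def archive_old_orders(active_db, archive_db, cutoff: datetime) -> int:
    """Copy non-active rows older than cutoff to archive_db, then delete them."""
    rows = active_db.execute(
        "SELECT id, flag, updated_at FROM orders "
        "WHERE flag != 'active' AND updated_at < ?",
        (cutoff.isoformat(),),
    ).fetchall()
    if not rows:
        return 0
    # Commit the copy first: if the job dies before the delete, the row exists
    # in both databases rather than in neither, and a rerun cleans it up.
    with archive_db:
        archive_db.executemany(
            "INSERT OR REPLACE INTO orders (id, flag, updated_at) VALUES (?, ?, ?)",
            rows,
        )
    with active_db:
        active_db.executemany(
            "DELETE FROM orders WHERE id = ?", [(row[0],) for row in rows]
        )
    return len(rows)
```

A scheduler would call something like `archive_old_orders(active, archive, one_month_before(datetime.now()))`. The two databases do not share a transaction, so the copy-then-delete order is what keeps a crash from losing rows.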