MySQL: Is partitioning a good way to handle deletion?
I have a MySQL table:
CREATE TABLE responses (
id INT NOT NULL AUTO_INCREMENT,
other_id INT NOT NULL,
details TEXT,
deleted BOOLEAN,
PRIMARY KEY (id)
);
Users can delete records in responses.
My plan is to use the field deleted to implement deletion. Any time a user deletes a record, I'll set deleted to 1.
Every now and then I might want to get rid of all the deleted records or to archive them. I'm thinking of using partitioning to make that faster:
PARTITION BY LIST(deleted) (
PARTITION pActive VALUES IN (0),
PARTITION pDeleted VALUES IN (1)
);
My question is: would this make the act of deletion slower? When I change a record's deleted field, MySQL would need to move the record to an entirely different partition. That seems like it could be slow.
Any suggestions would be greatly appreciated.
Comments (2)
Yes, I'd expect the transition between the two states to be slower, because changing the flag forces MySQL to move the row from one partition to the other. In return, queries that filter on deleted would be faster, since they only need to read one partition; queries that don't filter on the deletion status wouldn't benefit.
It comes down to which operation is most common on the table, and accepting that some compromise is involved.
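To make that trade-off concrete, here is a minimal sketch of the partitioned table together with the three operations discussed. One caveat: as far as I know, MySQL rejects the schema exactly as posted, because every unique key (including the primary key) has to contain all columns used in the partitioning expression. The widened primary key (id, deleted), the NOT NULL DEFAULT 0 on deleted, and the id value 42 are my assumptions, not part of the question.

```sql
-- Sketch only. The primary key is widened to (id, deleted) because MySQL
-- requires partitioning columns to be part of every unique key.
-- This change is an assumption, not part of the original schema.
CREATE TABLE responses (
    id INT NOT NULL AUTO_INCREMENT,
    other_id INT NOT NULL,
    details TEXT,
    deleted BOOLEAN NOT NULL DEFAULT 0,
    PRIMARY KEY (id, deleted)
)
PARTITION BY LIST (deleted) (
    PARTITION pActive VALUES IN (0),
    PARTITION pDeleted VALUES IN (1)
);

-- Soft delete: the partitioning column changes, so the row is moved
-- from pActive to pDeleted. This is the slower state transition.
UPDATE responses SET deleted = 1 WHERE id = 42;

-- Filtering on deleted lets the optimizer prune to a single partition.
-- In MySQL 5.7 and later, the partitions column of EXPLAIN shows which
-- partitions would be read.
EXPLAIN SELECT id, other_id FROM responses WHERE deleted = 0;

-- Periodic purge: empty the whole pDeleted partition in one statement
-- instead of deleting row by row.
ALTER TABLE responses TRUNCATE PARTITION pDeleted;
```

Note that with (id, deleted) as the key, uniqueness of id alone is no longer enforced by the primary key, which is one more compromise to weigh.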
I've been on a project in the past where this approach was used, and my personal feeling is that it's not the best approach. I think it's better to just delete the records. When you have a flag like this, everyone who uses your database has to understand that records that exist in the table might not be "real" records, depending on whether or not the deleted flag is set. It just makes the database less intuitive, and harder to use in my opinion.
If you're concerned about performance, I'd look at properly allocating your tablespaces, and you can still employ a partitioning scheme. You could partition the data by year and month (if you need that level of granularity) to help with the performance.
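As a rough illustration of the year-and-month idea, a date-partitioned variant might look like the sketch below. The table name responses_by_month, the created_at column, and the specific month boundaries are hypothetical; the original table has no date column. The primary key includes created_at for the same reason as any partitioned table in MySQL: the partitioning column has to be part of every unique key.

```sql
-- Sketch only: responses_by_month and created_at are hypothetical names.
CREATE TABLE responses_by_month (
    id INT NOT NULL AUTO_INCREMENT,
    other_id INT NOT NULL,
    details TEXT,
    created_at DATE NOT NULL,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    -- One partition per month; each bound is the first day of the next month.
    PARTITION p2024_01 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p2024_02 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    -- Catch-all so inserts with later dates do not fail.
    PARTITION pFuture VALUES LESS THAN MAXVALUE
);

-- Removing a whole month of old data is a metadata-level operation
-- rather than a row-by-row DELETE.
ALTER TABLE responses_by_month DROP PARTITION p2024_01;
```

Queries that restrict created_at to a date range can then be pruned to the matching partitions.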
But I would avoid the deleted flag. On the project I worked on, it became a real headache. For example, what if somebody tries to insert another record exactly the same as one that was "deleted" (deleted here means the deleted flag is true)? Do you set deleted to false on the existing record, or do you insert a brand new record? If you insert a brand new record, how do you define your primary key on the table, since you now have 2 records with the same key? Do you make deleted part of the key? The point is that you have to deal with all those types of non-trivial issues.
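The key problem is easier to see with a concrete case. The sketch below assumes a hypothetical rule that each other_id may have only one response; the table name responses_unique, the uq_other and uq_other_deleted index names, and the sample values are made up for illustration.

```sql
-- Sketch only: assumes a hypothetical business rule of one response per other_id.
CREATE TABLE responses_unique (
    id INT NOT NULL AUTO_INCREMENT,
    other_id INT NOT NULL,
    details TEXT,
    deleted BOOLEAN NOT NULL DEFAULT 0,
    PRIMARY KEY (id),
    UNIQUE KEY uq_other (other_id)
);

INSERT INTO responses_unique (other_id, details) VALUES (7, 'first answer');

-- Soft delete: the row stays in the table and still occupies other_id = 7.
UPDATE responses_unique SET deleted = 1 WHERE other_id = 7;

-- Re-inserting the "deleted" value would now fail with a duplicate-key
-- error (ERROR 1062), so it is left commented out here:
-- INSERT INTO responses_unique (other_id, details) VALUES (7, 'second answer');

-- Option A: revive the existing row instead of inserting a new one.
UPDATE responses_unique
SET deleted = 0, details = 'second answer'
WHERE other_id = 7;

-- Option B: make deleted part of the key. This allows one active and one
-- deleted row per other_id, but soft-deleting a second row for the same
-- other_id would collide on (other_id, 1).
ALTER TABLE responses_unique
    DROP INDEX uq_other,
    ADD UNIQUE KEY uq_other_deleted (other_id, deleted);
```

Neither option is clean, which is the headache described above; with a hard delete the original unique key simply works.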