选择Mysql引擎来处理大的“类型值”数据桌子

发布于 2024-10-14 00:02:09 字数 811 浏览 8 评论 0原文

我的任务是从数据库中删除操作期间未受影响的所有实体。我创建了一个单独的表,其中有两列,第一列代表表的名称,第二列是该表中记录的 ID。

CREATE TABLE edited_entities (
        table VARCHAR(50) not null, 
        id BIGINT(20) not null)

例如,如果我有表

CREATE TABLE puppy(
        id BIGINT(20) not null, 
        name VARCHAR(20) not null)

和其中的记录,

id | name
1  | Rex

如果我编辑此记录,我会将以下数据放入edited_entities中:

table | id
puppy | 1

然后我需要删除所有不受影响的实体(哪些id不在edited_entities表中),我会执行以下操作:

delete from puppy where id not in 
    (select ee.id from edited_entities ee where ee.table= 'puppy');

我想知道对于这种操作(MySql)最好的引擎是什么?默认的数据库引擎是InnoDB。我考虑过内存(堆),但我不确定它是否可以加快删除操作。

如果您有建议我如何优化所需的操作,我将很高兴在这里提出。

我不想在小狗表中添加额外的列。

My task is to delete all entities that were not affected during an operation from the database. I created a separate table which have two columns, first represets the name of the table and second is the id of the record in that table.

CREATE TABLE edited_entities (
        table VARCHAR(50) not null, 
        id BIGINT(20) not null)

For example if I have table

CREATE TABLE puppy(
        id BIGINT(20) not null, 
        name VARCHAR(20) not null)

and a record in it

id | name
1  | Rex

If I edit this record i will put the following data into edited_entities:

table | id
puppy | 1

Then I need to delete all non affected entities (which ids are not in edited_entities table) and I do following:

delete from puppy where id not in 
    (select ee.id from edited_entities ee where ee.table= 'puppy');

I wonder what is the best engine for such kind of operation (MySql)? The default db engine is InnoDB. I thought about Memory (Heap) but I am not sure if it can faster the delete operation.

If you have suggestion how can I optimise the required operation I will be glad to here it.

I don't whant to add additional columns into puppy table.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

A君 2024-10-21 00:02:09

内存将会更快,因为它不必在事务结束时写入磁盘。在这种情况下,我会首先尝试 BTREE 而不是 HASH 索引,因为它允许您使用部分复合索引,就像磁盘表一样。

还可以尝试使用准备好的语句进行插入和删除操作:在处理之前准备每个语句之一,然后使用相关参数进行调用。它可能更快,因为它不需要解析 SQL;但在某些情况下,整个系统会变得有点慢,因为它们占用了大量的内存。

另一个选择是实验性的“HandlerSocket”功能,可在 MySQL 的多个分支中使用,例如Percona 服务器,它允许您将 MySQL 表作为 NoSQL 存储访问,具有巨大的性能优势,但完全符合 ACID 要求。 (Percona 人是性能狂热者;即使您不能使用 HandlerSocket,也一定要测试他们的分支)

最后,一个可行的(但对您来说更多的工作)选项是单独的内存数据库。我倾向于使用 Redis,它是一种非常高速的内存中键值存储,并且添加了扭曲“值”是有用的数据结构。在您的情况下,您可以为每个表存储一组 ID,就像

tokeep:puppy => 1,4,6,76.....

将元素添加到集合中一样简单(且原子)操作 (SADD tokeep:puppy 76),并且在最后,您将它们全部获取以创建 SQL DELETE WITH id NOT IN (...) 操作

而且,我想到的最后一个选项(对您来说还有更多工作)是将所有内容放入在同一家商店中:有一个 Redis 的分支,以前称为 redisql,但现在它是 Alchemy Database;它将 SQL 表添加到 Redis,保留了 NoSQL 的大部分性能优势。因此,您可以将“常规”表作为 Alchemy 中的 SQL 表,将“tokeep”集存储在同一服务器上的 NoSQL 上,最后执行以下操作:

DELETE FROM puppy WHERE id NOT IN ($SMEMBERS tokeep:puppy );

Bam!

Memory will be faster, since it doesn't have to go to disk at the end of the transaction. In this case I'd try first BTREE and not HASH index, because it lets you use partial composite indexes, just like on-disk tables.

Also try with prepared statements for insert and delete operations: prepare one of each before processing and then invoke with the relevant parameters. It might be faster since it doesn't have to parse the SQL; but there are some cases where the whole system gets a little slower because they take non-trivial amounts of memory.

Another option is the experimental 'HandlerSocket' feature, available in several forks of MySQL like Percona server, it lets you access a MySQL table as a NoSQL store, with huge performance benefits, but full ACID compliance. (The Percona people are performance fanatics; even if you can't use HandlerSocket, be sure to test their fork)

Finally, a viable (but more work for you) option is a separate in-memory database. I'd tend to go with Redis, which is a very high-speed in-memory key-value store with the added twist that the 'values' are useful data structures. In your case, you could store a set of IDs for each table, something like

tokeep:puppy => 1,4,6,76.....

it's a simple (and atomic) operation to add an element to the set (SADD tokeep:puppy 76), and at the end you fetch them all to create the SQL DELETE WITH id NOT IN (...) operation

And, the very last option I think of (and still more work for you), would be to put everything in the same store: there's a fork of Redis, previously called redisql, but now it's now Alchemy Database; it adds SQL tables to Redis, keeping most of the performance benefits of NoSQL. So, you could have your 'regular' tables as SQL tables in Alchemy, store your 'tokeep' sets on NoSQL on the same server, and at the end do a:

DELETE FROM puppy WHERE id NOT IN ($SMEMBERS tokeep:puppy );

Bam!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文