使用子查询在 mysql 库上删除非常慢

发布于 2024-12-03 21:26:51 字数 2176 浏览 2 评论 0原文

这个 mysql 查询运行了大约 10 个小时,还没有完成。有什么事情非常不对劲。

这里有两个表(文本和垃圾邮件)。垃圾邮件将垃圾邮件条目的 ID 存储在我要删除的文本中。

DELETE FROM tname.text WHERE old_id IN (SELECT textid FROM spam);

spam 只有 2 列,都是整数。 800K 条目的文件大小为几 Mb。两个 int 都是主键。

文本有 3 列。 id(主键)、文本、标志。大约 1200K 条目,大约 2.1 GB 大小(大多数垃圾邮件)。

服务器是 xeon 四核、2 GB 内存(不要问我为什么)。只有 apache(为什么?)和 mysqld 正在运行。它是一个旧的免费bsd和mysql 4.1.2(不要问我为什么)

线程:6问题:188805慢查询:318打开:810刷新表:1打开表:157每秒查询平均:7.532

Mysql my.cnf:

[mysqld]
datadir=/usr/local/mysql
log-error=/usr/local/mysql/mysqld.err
pid-file=/usr/local/mysql/mysqld.pid
tmpdir=/var/tmp
innodb_data_home_dir =
innodb_log_files_in_group = 2
join_buffer_size=2M
key_buffer_size=32M
max_allowed_packet=1M
max_connections=800
myisam_sort_buffer_size=32M
query_cache_size=8M
read_buffer_size=2M
sort_buffer_size=2M
table_cache=256
skip-bdb
log-slow-queries = slow.log
long_query_time = 1

#skip-innodb
#default-table-type=innodb
innodb_data_file_path = /usr/local/mysql/ibdata1:10M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/
innodb_buffer_pool_size = 128M
innodb_log_file_size = 16M
innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit=1
#innodb_additional_mem_pool_size=1M
#innodb_lock_wait_timeout=50

log-bin
server-id=201

[isamchk]
key_buffer_size=128M
read_buffer_size=128M
write_buffer_size=128M
sort_buffer_size=128M

[myisamchk]
key_buffer_size=128M[server:~] dmesg | grep memory
real memory  = 2146828288 (2047 MB)
avail memory = 2095534080 (1998 MB)

read_buffer_size=128M
write_buffer_size=128M
sort_buffer_size=128M
tmpdir=/var/tmp

该查询仅使用一个 cpu,top 表示 25% cpu 时间(所以 1 of 4)。

real memory  = 2146828288 (2047 MB)
avail memory = 2095534080 (1998 MB)

62 processes:  2 running, 60 sleeping
CPU states: 25.2% user,  0.0% nice,  1.6% system,  0.0% interrupt, 73.2% idle
Mem: 244M Active, 1430M Inact, 221M Wired, 75M Cache, 112M Buf, 31M Free
Swap: 4096M Total, 1996K Used, 4094M Free

  PID USERNAME     THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
11536 mysql         27  20    0   239M   224M kserel 3 441:16 94.29% mysqld

知道如何修复它吗?

This mysql query is running for around 10 hours and has not finished. Something is horribly wrong.

Two tables (text and spam) are here. Spam stores the ids of spam entrys in text that I want to delete.

DELETE FROM tname.text WHERE old_id IN (SELECT textid FROM spam);

spam has just 2 columns, both are ints. 800K entries has a file size of several Mbs. Both ints are primary keys.

text has 3 columns. id (prim key), text, flags. around 1200K entries, and around 2.1 gigabyte size (most spam).

The server is a xeon quad, 2 gigabyte ram (don't ask me why). Only apache (why?) and mysqld is running. Its an old free bsd and mysql 4.1.2 (don't ask me why)

Threads: 6 Questions: 188805 Slow queries: 318 Opens: 810 Flush tables: 1 Open tables: 157 Queries per second avg: 7.532

Mysql my.cnf:

[mysqld]
datadir=/usr/local/mysql
log-error=/usr/local/mysql/mysqld.err
pid-file=/usr/local/mysql/mysqld.pid
tmpdir=/var/tmp
innodb_data_home_dir =
innodb_log_files_in_group = 2
join_buffer_size=2M
key_buffer_size=32M
max_allowed_packet=1M
max_connections=800
myisam_sort_buffer_size=32M
query_cache_size=8M
read_buffer_size=2M
sort_buffer_size=2M
table_cache=256
skip-bdb
log-slow-queries = slow.log
long_query_time = 1

#skip-innodb
#default-table-type=innodb
innodb_data_file_path = /usr/local/mysql/ibdata1:10M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/
innodb_buffer_pool_size = 128M
innodb_log_file_size = 16M
innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit=1
#innodb_additional_mem_pool_size=1M
#innodb_lock_wait_timeout=50

log-bin
server-id=201

[isamchk]
key_buffer_size=128M
read_buffer_size=128M
write_buffer_size=128M
sort_buffer_size=128M

[myisamchk]
key_buffer_size=128M[server:~] dmesg | grep memory
real memory  = 2146828288 (2047 MB)
avail memory = 2095534080 (1998 MB)

read_buffer_size=128M
write_buffer_size=128M
sort_buffer_size=128M
tmpdir=/var/tmp

The query is using just one cpu, top says 25% cpu time (so 1 of 4).

real memory  = 2146828288 (2047 MB)
avail memory = 2095534080 (1998 MB)

62 processes:  2 running, 60 sleeping
CPU states: 25.2% user,  0.0% nice,  1.6% system,  0.0% interrupt, 73.2% idle
Mem: 244M Active, 1430M Inact, 221M Wired, 75M Cache, 112M Buf, 31M Free
Swap: 4096M Total, 1996K Used, 4094M Free

  PID USERNAME     THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
11536 mysql         27  20    0   239M   224M kserel 3 441:16 94.29% mysqld

Any idea how to fix it?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

一场春暖 2024-12-10 21:26:51

根据我的经验,子查询通常是 SQL 语句执行速度缓慢的原因,因此我尽量避免使用它们。试试这个:

DELETE tname FROM tname INNER JOIN spam ON (tname.old_id = spam.textid);

免责声明:此查询未经测试,请先进行备份! :-)

In my experience sub queries are often a cause of slow execution times in SQL statements, therefor I try to avoid them. Try this:

DELETE tname FROM tname INNER JOIN spam ON (tname.old_id = spam.textid);

Disclaimer: This query is not tested, make backups first! :-)

梦回旧景 2024-12-10 21:26:51

您选择的 where id in (select ...) 将始终表现不佳。

相反,使用非常有效的普通连接:

DELETE `text` 
FROM spam
join `text` on `text`.old_id = spam.textid;

首先注意从垃圾邮件中进行选择,然后连接到文本,这将提供最佳性能。

Your choice of where id in (select ...) will always perform poorly.

Instead, use a normal join which will be very efficient:

DELETE `text` 
FROM spam
join `text` on `text`.old_id = spam.textid;

Notice selection from spam first, then joining to text, which will give the best performance.

如痴如狂 2024-12-10 21:26:51

当然,这会花费很多时间,因为它对每条记录都执行子查询,但是直接使用 INNER JOIN 这个查询只执行一次
让我们假设该查询将执行

10 ms for 50000 rec  full time = 50000 * 10 ms ---> 8.333 minutes !! at least don't forget the condition and deleting time .....

,但使用 join 查询将仅执行一次:

DELETE t FROM tname.text t INNER JOIN (SELECT textid FROM spam) sq on t.old_id = sq.textid ;

of corse it will take a lot of time because it execute the subquery for every record but by using INNER JOIN directly this query is executed only one time
lets think that the query will take

10 ms for 50000 rec  full time = 50000 * 10 ms ---> 8.333 minutes !! at least don't forget the condition and deleting time .....

but using join the query will be executed only one time :

DELETE t FROM tname.text t INNER JOIN (SELECT textid FROM spam) sq on t.old_id = sq.textid ;
渡你暖光 2024-12-10 21:26:51

将不属于 spam 表单 text 的行复制到新表中。然后删除 text 表并重命名创建的表。
好主意是不要向创建的表添加任何键。重命名后添加密钥。

Copy rows that are not in spam form text to new table. Then delete text table and rename created table.
Good idea is not to add any keys to created table. Add keys after renaming.

寄人书 2024-12-10 21:26:51

我认为您可能想用 LIMIT 将删除分块,并且您可能想在 JOIN 中进行删除。我在这篇文章中详细介绍了这一点,特别有助于归档数据和删除不再需要的行。

https://shatteredsilicon.net/blog/2021/07/12/mariadb-mysql-performance-tuning-optimization-how-to-delete-faster-on-mysql/

I think you might want to chunk the deletes down with a LIMIT and you might want to do that delete in a JOIN. I wrote a bit more about this in this article which helps specifically with archiving of data and deleting no longer needed rows.

https://shatteredsilicon.net/blog/2021/07/12/mariadb-mysql-performance-tuning-optimization-how-to-delete-faster-on-mysql/

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文