尽管 EXPLAIN 显示了良好的计划,但 MySQL 的连接查询速度很慢

发布于 2024-09-14 18:43:50 字数 1315 浏览 1 评论 0原文

我有以下场景:在 MySQL 数据库中,我有 2 个 MyISAM 表,一个有 420 万行,另一个有 3.2 亿行。以下是表的架构:

Table1(4.2M 行)

F1 INTEGER UNSIGNED NOT NULL PRIMARY KEY
f2 varchar(40)
f3 varchar(40)
f4 varchar(40)
f5 varchar(40)
f6 smallint(6)
f7 smallint(6)
f8 varchar(40)
f9 varchar(40)
f10 smallint(6)
f11 varchar(10)
f12 tinyint(4)
f13 smallint(6)
f14 text

Table2(320M 行)

F1 INTEGER UNSIGNED NOT NULL PRIMARY KEY
f2 INTEGER UNSIGNED NOT NULL

Table2 位于不同的数据库中,但我使用查询这两个表的存储过程。两个表之间的关系是,对于Table1.F1,最多可能有大约。 Table2.F1(外键)中有 100 行匹配,并且将为这些匹配的键返回 Table2.f2 的值。 我在 Table1 上有一个索引 IX1(f2(15),f3(10)) ,在表 2 中有一个索引 IX2(F1,f2) 和 IX3(f2)

我正在运行的查询如下:

SELECT g.F1
FROM DB1.Table1 g 
INNER JOIN DB2.Table2 gp ON g.F1 = gp.F1 
WHERE (gp.f2 = 452677825) AND
(g.f2 = 'A string value') LIMIT 0,56

此查询有时非常快(<1s),但更改 g.F2 比较的字符串值会导致查询甚至需要 11 秒以上,有时甚至 30 秒。我不明白为什么会这样。以下是执行的 SELECT 上的 EXPLAIN 的输出。

1, 'SIMPLE', 'g', 'ref', 'PRIMARY,IX1', 'IX1', '17', 'const', 901, 'Using where'
1, 'SIMPLE', 'gp', 'ref', 'IX3,IX2', 'IX2', '8', 'DB1.g.F1,const', 1, 'Using index'

这似乎是一个相当不错的执行计划。解释的顶行中的行数最多为 2000,但我不明白为什么这需要比几分之一秒更长的时间才能返回结果。我还对查询运行了探查器,并注意到查询 99.9% 的时间都花在“发送数据”阶段。谁能解释一下为什么会这样,以及可以采取什么措施来优化查询?

提前致谢, 蒂姆

I have the following scenario: In a MySQL database, I have 2 MyISAM tables, one with 4.2 million rows, and another with 320 million rows. The following is the schema for the tables:

Table1 (4.2M rows)

F1 INTEGER UNSIGNED NOT NULL PRIMARY KEY
f2 varchar(40)
f3 varchar(40)
f4 varchar(40)
f5 varchar(40)
f6 smallint(6)
f7 smallint(6)
f8 varchar(40)
f9 varchar(40)
f10 smallint(6)
f11 varchar(10)
f12 tinyint(4)
f13 smallint(6)
f14 text

Table2 (320M rows)

F1 INTEGER UNSIGNED NOT NULL PRIMARY KEY
f2 INTEGER UNSIGNED NOT NULL

Table2 is in a different database but I am using a stored procedure which queries the two tables. The relation between the two tables is that for Table1.F1 there may be up to approx. 100 rows in Table2.F1 (foreign key) which match, and the value for Table2.f2 will be returned for these matched keys.
I have an index IX1(f2(15),f3(10)) on Table1 and an index IX2(F1,f2) and IX3(f2) in Table 2

The queries I am running are the following:

SELECT g.F1
FROM DB1.Table1 g 
INNER JOIN DB2.Table2 gp ON g.F1 = gp.F1 
WHERE (gp.f2 = 452677825) AND
(g.f2 = 'A string value') LIMIT 0,56

This query is sometimes very fast (<1s) but changing the string value that g.F2 is compared to leads to queries which take even over 11 and sometimes even 30 seconds. I cannot understand why this is so. The following is the output of the EXPLAIN on the SELECT that is executed.

1, 'SIMPLE', 'g', 'ref', 'PRIMARY,IX1', 'IX1', '17', 'const', 901, 'Using where'
1, 'SIMPLE', 'gp', 'ref', 'IX3,IX2', 'IX2', '8', 'DB1.g.F1,const', 1, 'Using index'

which seems to be quite a good execution plan. The number of rows in the top row of the explain goes to 2000 at most, but I do not see why this should take any longer than a fraction of a second to return results. I also ran profiler on the query and noticed that the queries are spending 99.9% of the time on the "Sending data" stage. Can anyone please explain why this is so, and what can be done to optimise the query?

Thanks in advance,
Tim

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

鹤舞 2024-09-21 18:43:50

我不是这方面的专家,但这里有一些想法:

g.F2 更改时查询速度需要更长的时间是因为缓存。 MySQL 会保存每个查询的结果(直到缓存已满),但新查询在空缓存上运行,因此需要更长的时间。您不应该基于此进行优化。 (参见如何测量准确

我无法从您的信息中判断出ggp表是否具有更大的特异性(看起来像gp?)在 where 子句中,但您可能想尝试使用子查询。 (参见如何强制首先执行内部查询

关于分析,您可能会遇到物理阈值,例如超出内存分配(使用交换对性能来说是灾难性的),这在解释,或者在这种情况下 explain 是否只是错误的。

I'm not an expert in this area, but here are a few thoughts:

Query speed taking longer when g.F2 changes is because of caching. MySQL will save the results for each query (until the cache is full), but new queries are run on an empty cache, so they take longer. You shouldn't optimize based on this. (See How to measure accurately)

I can't tell from your information whether the g or gp table has greater specificity (seems like gp?) in the where clause, but you may want to try a subquery instead. (See How to force the inner query to execute first)

Regarding profiling, it's possible you're hitting a physical threshold like exceeding ram allocation (using swap is disastrous for performance) that would not be obvious from explain, or whether explain is just wrong in this case.

月依秋水 2024-09-21 18:43:50

如果可以的话,您可能想尝试调整 my.cnf,您想要使用的属性是 key_buffer_size。 MyISAM 索引存储在 .MYI 文件中,如果您找到这些文件并合计文件大小(例如 ls -lh /var/lib/mysql/dbname/*.MYI),您可以粗略估计密钥缓冲区需要多大才能适应MySQL 文档确实建议不要超过系统内存的 25%。

If you are able to you may want to try tweaking your my.cnf, the property you want to play with is key_buffer_size. MyISAM indexes are stored in .MYI files if you locate these and total up the file sizes (e.g. ls -lh /var/lib/mysql/dbname/*.MYI) you can roughly estimate how big the key buffer needs to be to fit all of your indexes in. The MySQL docs does recommend not to exceed 25% of system memory though.

梦太阳 2024-09-21 18:43:50

两个表之间的关系是,对于Table1.F1,最多可能有大约。 Table2.F1 中有 100 行

澄清一下,Table1.F1Table2.F1 之间的关系是一对一还是一对多?对我来说,这个语句意味着一对多,但从模式来看,每个字段都是主(即唯一)键。

无论如何,我怀疑 g.f2(15) 的统一并不统一,并且当遇到统计异常值时,性能会相应下降。

结果是否

SELECT f2(15) AS f2_15, COUNT(*) AS cnt
FROM Table1
GROUP BY f2(15) 
ORDER BY cnt DESC

显示出一些显着的异常值?

The relation between the two tables is that for Table1.F1 there may be up to approx. 100 rows in Table2.F1

To clarify, is the relationship between Table1.F1 and Table2.F1 one-to-one, or one to many? To me, this statement implies one-to-many, but from the schema, each of the fields are primary (i.e. unique) keys.

At any rate, I suspect that the uniform of g.f2(15) is not uniform, and that when the statistical outliers are hit, performance degrades accordingly.

Do the results of

SELECT f2(15) AS f2_15, COUNT(*) AS cnt
FROM Table1
GROUP BY f2(15) 
ORDER BY cnt DESC

show some significant outliers?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文