Mysql select on 索引列在大表上速度减慢

发布于 2024-11-14 03:29:57 字数 1415 浏览 6 评论 0原文

我有两个表:A - 301 列(第一个名为 a1 int(11) 主键,第二个到第 301 个 - double(15,11) )& B - 33 列(第一个 - b1 int(11) 唯一键,第二个 - b2 varchar(100) 主键,...,第 33 个 - b33 int(11) MUL )。

A 和 A 都B 有约 13,500,000 条记录。

我的 mysql 查询:对于 pos 的每个值,集合 (1, 1000, 2000, ..., 13500000) 中的 pos 为 1000 的倍数:

在 a1=b1 上从 A 连接 B 中选择 A.*、b2、b5、b7、b8、b10、b13、b33,其中 b33 >= pos 且 b33 <位置+1000;

对于 b33 <= 600,000 的值,查询需要 1-5 秒。此后,查询开始需要 20-30 秒。当 b33 >= 8,000,000 时,查询开始花费 60-70 秒。我不明白为什么会出现减速。 b33 已建立索引,并且连接发生在一个表中定义为主键且在另一个表中定义为唯一的键上。有解决方法吗?这确实阻碍了代码的速度,我将不得不拆分表 A 和表 A 。如果没有其他办法的话,B 分成几个较小的。我真的希望我不必这样做!请帮忙!

编辑:这是 EXPLAIN 的 o/p -

******** ***** 1. 行******< em>*******
编号:1
选择类型:简单
表:B
类型:范围
可能的键:b1,b33
键:b33
key_len: 4
参考:NULL
行:981
额外:使用where
************* > 2. 行*********** **
编号:1
选择类型:简单
表:A
类型:eq_ref
可能的键:主要
键:主要
key_len: 4
参考:DBName.B.b1
行:1
额外:
2 行一组(0.00 秒)

I have two tables : A - 301 columns ( 1st one named a1 int(11) Primary Key, 2nd to 301th - double(15,11) ) & B - 33 columns ( 1st one - b1 int(11) Unique Key, 2nd One - b2 varchar(100) Primary Key, ... , 33rd - b33 int(11) MUL ).

Both A & B have ~ 13,500,000 records.

My mysql query : For every value of pos, with pos in set (1, 1000, 2000, ..., 13500000) in multiples of 1000 :

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B on a1=b1 where b33 >= pos and b33 < pos+1000;

The query takes 1-5 seconds for values of b33 <= 600,000. After that the query starts taking 20-30 seconds. When b33 >= 8,000,000 the query starts taking 60-70s. I can't understand why the slowdown is happening. b33 is indexed and the join takes place on the key that is defined as primary in one table and unique in the other. Is there a workaround for this? This is really hampering the speed of the code and I will have to split the tables A & B into several smaller ones if nothing else works. I really hope I don't have to do that! Please help!

EDIT: Here is the o/p of EXPLAIN -

************* 1. row *************
id: 1
select_type: SIMPLE
table: B
type: range
possible_keys: b1,b33
key: b33
key_len: 4
ref: NULL
rows: 981
Extra: Using where
************* 2. row *************
id: 1
select_type: SIMPLE
table: A
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: DBName.B.b1
rows: 1
Extra:
2 rows in set (0.00 sec)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

避讳 2024-11-21 03:29:57

由于您的数据库有数百万条记录,您是否采取了措施来保持数据库的良好健康?

如果您的数据频繁更改(可能是大量插入?),每晚运行以下命令可能有助于提高总体响应能力:

mysqlcheck --check --analyze --auto-repair --all-databases --silent

虽然我建议在运行之前阅读一些关于mysqlcheck的内容命令,只是为了让你知道它在做什么。

您还应该查看优化您的InnoDB配置,特别是 innodb_buffer_pool_size (你可以给它的内存越多越好)。我在类似大小的表中的基于日期的字段(当然,我们立即索引)上遇到了类似的缓慢情况,并且将缓冲池大小从默认的 8 MB 增加到几 GB 产生了非常明显的差异。

如果要从连接涉及的任何表中删除许多行,您可以考虑运行 优化表

Since your database has several million records, are you doing anything to keep your DB in good health?

Running the following command nightly might help with general responsiveness if your data changes frequently (lots of inserts, maybe?):

mysqlcheck --check --analyze --auto-repair --all-databases --silent

Though I would suggest reading up a bit on mysqlcheck before running the command, just so you know what it's doing.

You should also look at optimizing your InnoDB configuration, especially innodb_buffer_pool_size (the more memory you can give it, the better). I was experiencing a similar slowness on a date-based field (which, of course, we immediately indexed) in a similarly sized table, and increasing the buffer pool size from the default 8 megabytes to several gigabytes made a very noticeable difference.

If you're deleting many rows from any table involved in the join, you might consider running OPTIMIZE TABLE as well.

待"谢繁草 2024-11-21 03:29:57

只是在黑暗中拍摄...

select A.*, b2, b5, b7, b8, b10, b13, b33 
  from A join B 
  on a1=b1 
  where b33 BETWEEN pos AND pos+999;

Just a shot in the dark...

select A.*, b2, b5, b7, b8, b10, b13, b33 
  from A join B 
  on a1=b1 
  where b33 BETWEEN pos AND pos+999;
笔芯 2024-11-21 03:29:57

我不是MySQL(或任何东西!)专家,但我会考虑一些事情。首先,b33分布均匀吗?可能会因为有效地检索更多行而变慢?
其次,您是否考虑过在单个查询中完成所有工作而不是 13500 次?类似于:

select A.*, b2, b5, b7, b8, b10, b13, b33, (b33 - 1 DIV 1000) the_group
from A join B on a1=b1 

第三,大胆猜测,如果您的MySQL版本支持它,请首先使用内联视图进行过滤:

select A.*, b2, b5, b7, b8, b10, b13, b33 
from A join (select b1,b2, b5, b7, b8, b10, b13, b33 
             from B b33 >= pos and b33 < pos+1000) B_NEW 
     on a1=b1 ;

第四(应该是第一个),做一个解释计划并尝试了解为什么查询比快速查询慢查询速度慢的查询。

祝你好运!!

I'm no MySQL(or anything!) guru but some things I would consider. First, is b33 evenly distributed? May be it's slower because is effectively retrieving more rows?
Second, have you considered to do all the work in a single query instead of 13500? Something like:

select A.*, b2, b5, b7, b8, b10, b13, b33, (b33 - 1 DIV 1000) the_group
from A join B on a1=b1 

Third, a wild guess, if your version of MySQL supports it, use an inlinew view to do the filtering first:

select A.*, b2, b5, b7, b8, b10, b13, b33 
from A join (select b1,b2, b5, b7, b8, b10, b13, b33 
             from B b33 >= pos and b33 < pos+1000) B_NEW 
     on a1=b1 ;

Fourth(should be first), do a explain plan and try to learn why the query is slow comparing the fast querys with the slow ones.

Good luck!!

你的背包 2024-11-21 03:29:57

您能向我们展示您在 B 上设置的索引吗? (对 b33 上的索引是如何定义的,以及它是在单个列上还是在多个列上定义感兴趣):

SHOW INDEXES FROM B;

当您仅从 B 中进行选择时,您是否看到相同的速度下降?

select b2, b5, b7, b8, b10, b13, b33 from B where b33 >= pos and b33 < pos+1000;

您能否向我们展示SHOW CREATE TABLE中涉及字段b33的部分(对允许的NULL感兴趣)

您是否使用MyISAM或InnoDB作为数据库引擎? (您可以在 SHOW CREATE TABLE 的结果中看到这一点)。

Can you show us the indices that you have setup on B? (interested in how the index on b33 is defined, and whether it is defined on a single column, or on multiple columns):

SHOW INDEXES FROM B;

Do you see the same speed decrease when you only select from B?

ie

select b2, b5, b7, b8, b10, b13, b33 from B where b33 >= pos and b33 < pos+1000;

Can you show us the part from the SHOW CREATE TABLE that involves field b33 (interested in NULL allowed)

Are you using MyISAM or InnoDB as database engine? (You can see this in the result of SHOW CREATE TABLE).

屋顶上的小猫咪 2024-11-21 03:29:57

解释计划和索引看起来不错。

我建议你比较一下配置文件,看看时间到底去了哪里:

SET profiling=1;

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B on a1=b1 where b33 >= 0 and b33 < 1000;
SHOW PROFILE;

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B on a1=b1 where b33 >= 1000000 and b33 < 1001000;
SHOW PROFILE;


SET profiling=0;

但我认为它可能会很慢,因为索引后 600k 不再适合内存,并且进行了更多的磁盘搜索

Explain plan and indexes seem fine.

I suggest you to compare the profiles and see where time really goes:

SET profiling=1;

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B on a1=b1 where b33 >= 0 and b33 < 1000;
SHOW PROFILE;

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B on a1=b1 where b33 >= 1000000 and b33 < 1001000;
SHOW PROFILE;


SET profiling=0;

but i think it maybe slow because of index post 600k does not fit into memory anymore and more disk seeks are made

烟雨扶苏 2024-11-21 03:29:57

您需要重构此查询!

这是您的旧查询:

select A.*, b2, b5, b7, b8, b10, b13, b33
from A join B on a1=b1 where b33 >= pos and b33 < pos+1000;

这是新查询:

SELECT
    AAA.*,b2,b5,b7,b8,b10,b13,b33
FROM
    A AAA INNER JOIN
    (
        select
            A.a1,b2,b5,b7,b8,b10,b13,b33
        from
            A INNER JOIN
            (
               SELECT
                   b1,b2,b5,b7,b8,b10,b13,b33
               FROM B
               WHERE
                    b33 >= pos and
                    b33 < pos+1000
            ) BB
            ON A.a1=B.b1
    ) BBB
    USING (a1)
;

CAVEAT

此重构查询的目标是使查询计划中的临时表尽可能小。事实上,子查询 BBB 在任何给定时间都不应该超过 1000 行

尝试一下!

YOU NEED TO REFACTOR THIS QUERY !!!

Here is your old query :

select A.*, b2, b5, b7, b8, b10, b13, b33
from A join B on a1=b1 where b33 >= pos and b33 < pos+1000;

Here is the new one :

SELECT
    AAA.*,b2,b5,b7,b8,b10,b13,b33
FROM
    A AAA INNER JOIN
    (
        select
            A.a1,b2,b5,b7,b8,b10,b13,b33
        from
            A INNER JOIN
            (
               SELECT
                   b1,b2,b5,b7,b8,b10,b13,b33
               FROM B
               WHERE
                    b33 >= pos and
                    b33 < pos+1000
            ) BB
            ON A.a1=B.b1
    ) BBB
    USING (a1)
;

CAVEAT

The goal of this refactored query is to make the temp tables within the query plan as small as possible. In fact, subquery BBB should never have more than 1000 rows at any given time.

Give it a Try !!!

世俗缘 2024-11-21 03:29:57

ayesha129p,

尝试将 b33 约束移至 join 子句中。听起来优化器只应用了连接集创建前的 b33 约束之一。

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B
  on a1=b1 and b33 >= pos and b33 < pos+1000;

这样,优化器应在尝试连接之前使用 b33 索引并将 B 行设置减少到 1000。

ayesha129p,

try moving the b33 constraints into the join clause. It sounds like the optimizer is only applying one of the b33 constraints pre-join-set-creation.

select A.*, b2, b5, b7, b8, b10, b13, b33 from A join B
  on a1=b1 and b33 >= pos and b33 < pos+1000;

This way the optimizer should use the b33 index and reduce the B row set to 1000 before attempting the join.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文