如何使用这个基本的内连接避免全表扫描？

发布于 2024-08-16 02:46:49 字数 1541 浏览 4 评论 0原文

我有一个表，该表具有指向存储一些 blob 数据的表的外键。当我对主表上的条件进行内部联接时，联接类型从“索引”变为“全部”。我想避免这种情况，因为我的 blob 表约为数十 GB。我怎样才能避免它？

这是基本的内部联接：

EXPLAIN SELECT m.id, b.id, b.data 
        FROM metadata m, blobstore b 
        WHERE m.fkBlob = b.id;

1, 'SIMPLE', 'm', 'index', 'fk_blob', 'fk_blob', '4', '', 1, 'Using index'
1, 'SIMPLE', 'b', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'blob_index.m.fkBlob', 1, ''

这里我在主表上添加了一个条件：

EXPLAIN SELECT m.id, b.id, b.data 
        FROM metadata m, blobstore b 
        WHERE m.fkBlob = b.id AND m.start < '2009-01-01';
1, 'SIMPLE', 'b', 'ALL', 'PRIMARY', '', '', '', 1, ''
1, 'SIMPLE', 'm', 'ref', 'fk_blob,index_start', 'fk_blob', '4', 'blob_index.b.id', 1, 'Using where'

请注意，表的列出顺序已更改。由于我添加了有关主表的条件，它现在正在对 blob 表进行全表扫描。

这是架构：

 DROP TABLE IF EXISTS `blob_index`.`metadata`;
    CREATE TABLE  `blob_index`.`metadata` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `fkBlob` int(10) unsigned NOT NULL,
      `start` datetime NOT NULL,
      PRIMARY KEY (`id`),
      KEY `fk_blob` (`fkBlob`),
      KEY `index_start` (`start`),
      CONSTRAINT `fk_blob` FOREIGN KEY (`fkBlob`) REFERENCES `blobstore` (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;


    DROP TABLE IF EXISTS `blob_index`.`blobstore`;
    CREATE TABLE  `blob_index`.`blobstore` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `data` mediumblob NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

原文

I have a table that has a foreign key to a table that stores some blob data. When I do an inner join on the tables with a condition on the main table the join type goes from 'index' to 'ALL'. I would like to avoid this as my blob table is on the order of tens of gigabytes. How can I avoid it?

Here is the the basic inner join:

EXPLAIN SELECT m.id, b.id, b.data 
        FROM metadata m, blobstore b 
        WHERE m.fkBlob = b.id;

1, 'SIMPLE', 'm', 'index', 'fk_blob', 'fk_blob', '4', '', 1, 'Using index'
1, 'SIMPLE', 'b', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'blob_index.m.fkBlob', 1, ''

Here I add a condition on the main table:

EXPLAIN SELECT m.id, b.id, b.data 
        FROM metadata m, blobstore b 
        WHERE m.fkBlob = b.id AND m.start < '2009-01-01';
1, 'SIMPLE', 'b', 'ALL', 'PRIMARY', '', '', '', 1, ''
1, 'SIMPLE', 'm', 'ref', 'fk_blob,index_start', 'fk_blob', '4', 'blob_index.b.id', 1, 'Using where'

Notice that the order in which the tables are listed has changed. It is now doing a full table scan on the blob table because of a condition I've added regarding the main table.

Here is the schema:

 DROP TABLE IF EXISTS `blob_index`.`metadata`;
    CREATE TABLE  `blob_index`.`metadata` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `fkBlob` int(10) unsigned NOT NULL,
      `start` datetime NOT NULL,
      PRIMARY KEY (`id`),
      KEY `fk_blob` (`fkBlob`),
      KEY `index_start` (`start`),
      CONSTRAINT `fk_blob` FOREIGN KEY (`fkBlob`) REFERENCES `blobstore` (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;


    DROP TABLE IF EXISTS `blob_index`.`blobstore`;
    CREATE TABLE  `blob_index`.`blobstore` (
      `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `data` mediumblob NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

脱离于你 2024-08-23 02:46:49

我猜你正在空表上尝试这个（因为MySQL认为它需要遍历一行来进行全表扫描），这可能会影响调度程序的结果。当您在真实的桌子上执行此操作时，EXPLAIN 结果可能会有所不同（实际上在我的测试中确实有所不同）。

回复收藏 0 原文

日记撕了你也走了 2024-08-23 02:46:49

优化器认为您的查询将从交换表顺序中受益（这很可能意味着统计信息不是最新的）。

您可以尝试在元数据（start，fkBlob）上添加索引：

CREATE INDEX ix_metadata_start_blob ON metadata (start, fkBlob)

并在两个表上运行ANALYZE TABLE。

这样，start 上的索引将用于过滤将成为前导的metadata。

您还可以显式强制 join: 的顺序

SELECT  *
FROM    metadata m
STRAIGHT_JOIN
        blobstore b
ON      b.id = m.fkBlob
WHERE   m.start <= '2009-01-01'

，但通常不建议这样做。

The optimizer thinks you query will benefit from swapping the table order (which most probably means that statistics are not up to date).

You can try adding an index on metadata (start, fkBlob):

CREATE INDEX ix_metadata_start_blob ON metadata (start, fkBlob)

and run ANALYZE TABLE on both tables.

This way, the index on start will be used for filtering on metadata which will be made leading.

You can also explicitly force the order of the join:

SELECT  *
FROM    metadata m
STRAIGHT_JOIN
        blobstore b
ON      b.id = m.fkBlob
WHERE   m.start <= '2009-01-01'

, though it's usually not recommended.

回复收藏 0 原文

貪欢 2024-08-23 02:46:49

如果我正确阅读了您发布的内容，它会从 index 到 ref 和 eq_ref 到 all。

CREATE INDEX idx_metadata USING BTREE ON `metadata` (fkBlob,start);

应该马上拿回来。

If I read what you posted right it goes from index to ref and eq_ref to all.

CREATE INDEX idx_metadata USING BTREE ON `metadata` (fkBlob,start);

Should take it right back.

回复收藏 0 原文

听风念你 2024-08-23 02:46:49

if the index doesnot take it right use HINTS

select /* INDEX <index_name> */
blah blah blah
from ........

if the index doesnot take it right use HINTS

select /* INDEX <index_name> */
blah blah blah
from ........

回复收藏 0 原文

不必在意 2024-08-23 02:46:49

在第一个示例中，MySQL 使用元数据 fk_blob 索引，因为它是覆盖索引——查询中使用的每一列都存在于索引中。（这就是“使用索引”的含义。）该查询仍然执行完整扫描，但它通过辅助索引而不是主索引扫描每一行。一旦你使用start，你就失去了覆盖索引，MySQL计算出使用blobstore作为驱动索引会更快。（InnoDB 的主索引与行存储集成在一起。）

如果您希望 MySQL 继续使用元数据索引作为驱动索引，请确保其上有一个对查询有用的索引。 (start, fkBlob) 上的索引最适合第二个查询，但对于其他查询可能没有用。下一个最佳索引是将 (fkBlob) 替换为 (fkBlob, start)。您必须在拥有太多索引（维护成本高昂）与拥有高效查询计划之间取得平衡。测试、测试、测试——永远不要盲目相信你的开发数据库上的解释。

回复收藏 0 原文

~没有更多了~