有人愿意帮助优化 MySQL 查询吗?

发布于 2024-07-26 11:09:19 字数 2684 浏览 2 评论 0原文

查询如下:

SELECT COUNT(*) AS c, MAX(`followers_count`) AS max_fc, 
       MIN(`followers_count`) AS min_fc, MAX(`following_count`) AS max_fgc,
       MIN(`following_count`) AS min_fgc, SUM(`followers_count`) AS fc,
       SUM(`following_count`) AS fgc, MAX(`updates_count`) AS max_uc,
       MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
WHERE `twitter_id` IN (SELECT `followed_by` 
                       FROM `relations` 
                       WHERE `twitter_id` = 123);

两个表是 profilesrelations。 两者都拥有超过 1,000,000 行的 InnoDB 引擎。 两者都在 twitter_id 上有索引,relations 在 (twitter_id, followed_by) 上有一个额外的索引。 查询执行时间超过 6 秒,这真的让我很沮丧。 我知道我可以以某种方式加入这个,但我的 MySQL 知识不太好,这就是为什么我请求你的帮助。

预先感谢大家=)

干杯, K ~

已更新

好的,我设法将时间缩短到 2.5 秒。 我使用 INNER JOIN 并添加了三个索引对。 这是解释结果:

id, select_type, table, type, possible_keys, 
    key, key_len, ref, rows, Extra

1, 'SIMPLE', 'r', 'ref', 'relation', 
    'relation', '4', 'const', 252310, 'Using index'

1, 'SIMPLE', 'p', 'ref', 'PRIMARY,twiter_id,id_fc,id_fgc,id_uc', 
    'id_uc', '4', 'follerme.r.followed_by', 1, ''

希望这有帮助。

另一个更新

这是两个表的 SHOW CREATE TABLE 语句:

CREATE TABLE `profiles` (
  `twitter_id` int(10) unsigned NOT NULL,
  `screen_name` varchar(45) NOT NULL default '',
  `followers_count` int(10) unsigned default NULL,
  `following_count` int(10) unsigned default NULL,
  `updates_count` int(10) unsigned default NULL,
  `location` varchar(45) default NULL,
  `bio` varchar(160) default NULL,
  `url` varchar(255) default NULL,
  `image` varchar(255) default NULL,
  `registered` int(10) unsigned default NULL,
  `timestamp` int(10) unsigned default NULL,
  `relations_timestamp` int(10) unsigned default NULL,
  PRIMARY KEY  USING BTREE (`twitter_id`,`screen_name`),
  KEY `twiter_id` (`twitter_id`),
  KEY `screen_name` USING BTREE (`screen_name`,`twitter_id`),
  KEY `id_fc` (`twitter_id`,`followers_count`),
  KEY `id_fgc` (`twitter_id`,`following_count`),
  KEY `id_uc` (`twitter_id`,`updates_count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `relations` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `twitter_id` int(10) unsigned NOT NULL default '0',
  `followed_by` int(10) unsigned default NULL,
  `timestamp` int(10) unsigned default NULL,
  PRIMARY KEY  USING BTREE (`id`,`twitter_id`),
  UNIQUE KEY `relation` (`twitter_id`,`followed_by`)
) ENGINE=InnoDB AUTO_INCREMENT=1209557 DEFAULT CHARSET=utf8

哇,真是一团糟 =) 抱歉!

Here's the query:

SELECT COUNT(*) AS c, MAX(`followers_count`) AS max_fc, 
       MIN(`followers_count`) AS min_fc, MAX(`following_count`) AS max_fgc,
       MIN(`following_count`) AS min_fgc, SUM(`followers_count`) AS fc,
       SUM(`following_count`) AS fgc, MAX(`updates_count`) AS max_uc,
       MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
WHERE `twitter_id` IN (SELECT `followed_by` 
                       FROM `relations` 
                       WHERE `twitter_id` = 123);

The two tables are profiles and relations. Both have over 1,000,000 rows, InnoDB engine. Both have indexes on twitter_id, relations has an extra index on (twitter_id, followed_by). The query is taking over 6 seconds to execute, this really frustrates me. I know that I can JOIN this somehow, but my MySQL knowledge is not so cool, that's why I'm asking for your help.

Thanks in advance everyone =)

Cheers,
K ~

Updated

Okay I managed to get down to 2,5 seconds. I used INNER JOIN and added the three index pairs. Here's the EXPLAIN results:

id, select_type, table, type, possible_keys, 
    key, key_len, ref, rows, Extra

1, 'SIMPLE', 'r', 'ref', 'relation', 
    'relation', '4', 'const', 252310, 'Using index'

1, 'SIMPLE', 'p', 'ref', 'PRIMARY,twiter_id,id_fc,id_fgc,id_uc', 
    'id_uc', '4', 'follerme.r.followed_by', 1, ''

Hope this helps.

Another update

Here are the SHOW CREATE TABLE statements for both tables:

CREATE TABLE `profiles` (
  `twitter_id` int(10) unsigned NOT NULL,
  `screen_name` varchar(45) NOT NULL default '',
  `followers_count` int(10) unsigned default NULL,
  `following_count` int(10) unsigned default NULL,
  `updates_count` int(10) unsigned default NULL,
  `location` varchar(45) default NULL,
  `bio` varchar(160) default NULL,
  `url` varchar(255) default NULL,
  `image` varchar(255) default NULL,
  `registered` int(10) unsigned default NULL,
  `timestamp` int(10) unsigned default NULL,
  `relations_timestamp` int(10) unsigned default NULL,
  PRIMARY KEY  USING BTREE (`twitter_id`,`screen_name`),
  KEY `twiter_id` (`twitter_id`),
  KEY `screen_name` USING BTREE (`screen_name`,`twitter_id`),
  KEY `id_fc` (`twitter_id`,`followers_count`),
  KEY `id_fgc` (`twitter_id`,`following_count`),
  KEY `id_uc` (`twitter_id`,`updates_count`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `relations` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `twitter_id` int(10) unsigned NOT NULL default '0',
  `followed_by` int(10) unsigned default NULL,
  `timestamp` int(10) unsigned default NULL,
  PRIMARY KEY  USING BTREE (`id`,`twitter_id`),
  UNIQUE KEY `relation` (`twitter_id`,`followed_by`)
) ENGINE=InnoDB AUTO_INCREMENT=1209557 DEFAULT CHARSET=utf8

Wow, what a mess =) Sorry!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

恰似旧人归 2024-08-02 11:09:19

连接看起来像这样:

SELECT COUNT(*) AS c,
MAX(p.`followers_count`) AS max_fc,
MIN(p.`followers_count`) AS min_fc,
MAX(p.`following_count`) AS max_fgc,
MIN(p.`following_count`) AS min_fgc,
SUM(p.`followers_count`) AS fc,
SUM(p.`following_count`) AS fgc,
MAX(p.`updates_count`) AS max_uc,
MIN(p.`updates_count`) AS min_uc,
SUM(p.`updates_count`) AS uc
FROM `profiles` AS p
INNER JOIN `relations` AS r ON p.`twitter_id` = r.`followed_by`
WHERE r.`twitter_id` = 123;

为了帮助优化它,您应该在两个查询上运行 EXPLAIN SELECT ...。

A join would look something like this:

SELECT COUNT(*) AS c,
MAX(p.`followers_count`) AS max_fc,
MIN(p.`followers_count`) AS min_fc,
MAX(p.`following_count`) AS max_fgc,
MIN(p.`following_count`) AS min_fgc,
SUM(p.`followers_count`) AS fc,
SUM(p.`following_count`) AS fgc,
MAX(p.`updates_count`) AS max_uc,
MIN(p.`updates_count`) AS min_uc,
SUM(p.`updates_count`) AS uc
FROM `profiles` AS p
INNER JOIN `relations` AS r ON p.`twitter_id` = r.`followed_by`
WHERE r.`twitter_id` = 123;

To help optimize it you should run EXPLAIN SELECT ... on both queries.

呆萌少年 2024-08-02 11:09:19

创建以下复合索引:

profiles (twitter_id, followers_count)
profiles (twitter_id, following_count)
profiles (twitter_id, updates_count)

并发布查询计划,看在上帝的份上。

顺便问一下,这个 COUNT(*) 返回了多少行?

更新:

您的表格行很长。 在您选择的所有字段上创建复合索引:

profiles (twitter_id, followers_count, following_count, updates_count)

以便 JOIN 查询可以从该索引检索所需的所有值。

Create the following composite indexes:

profiles (twitter_id, followers_count)
profiles (twitter_id, following_count)
profiles (twitter_id, updates_count)

and post the query plan, for God's sake.

By the way, how many rows does this COUNT(*) return?

Update:

Your table rows are quite long. Create a composite index on all the fields you select:

profiles (twitter_id, followers_count, following_count, updates_count)

so that the JOIN query can retrieve all the values it need from that index.

也只是曾经 2024-08-02 11:09:19
SELECT COUNT(*) AS c,
  MAX(`followers_count`) AS max_fc, MIN(`followers_count`) AS min_fc,
  MAX(`following_count`) AS max_fgc, MIN(`following_count`) AS min_fgc,
  SUM(`followers_count`) AS fc, SUM(`following_count`) AS fgc,
  MAX(`updates_count`) AS max_uc, MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
JOIN `relations`
  ON (profiles.twitter_id = relations.followed_by)
WHERE relations.twitted_id = 123;

可能会快一点,但您需要测量并检查是否确实如此。

SELECT COUNT(*) AS c,
  MAX(`followers_count`) AS max_fc, MIN(`followers_count`) AS min_fc,
  MAX(`following_count`) AS max_fgc, MIN(`following_count`) AS min_fgc,
  SUM(`followers_count`) AS fc, SUM(`following_count`) AS fgc,
  MAX(`updates_count`) AS max_uc, MIN(`updates_count`) AS min_uc, SUM(`updates_count`) AS uc
FROM `profiles`
JOIN `relations`
  ON (profiles.twitter_id = relations.followed_by)
WHERE relations.twitted_id = 123;

might be a bit faster, but you'll need to measure and check if that is indeed so.

爱人如己 2024-08-02 11:09:19

count(*) 在 InnoDB 引擎下是一个非常昂贵的操作,您是否在没有那部分的情况下尝试过这个查询? 如果它导致最多的处理时间,那么也许您可以保留一个运行值,而不是每次都查询它。

count(*) is a very expensive operation under the InnoDB Engine, have you tried this query without that piece? If it's causing the most processing time then maybe you could keep a running value instead of querying for it each time.

懷念過去 2024-08-02 11:09:19

我会从程序员的角度来解决这个问题; 我有一个单独的表(或某处的存储区域),用于存储与原始查询中每个字段关联的最大值、最小值和总和值,并在每次更新和添加表记录时更新这些值。 (尽管如果处理不当,删除可能会出现问题)。

填充这些值的原始查询完成后(这与您发布的查询几乎相同),您实际上是将最终查询减少为从数据表中获取一行,而不是一次计算所有内容。

I'd approach this problem from a programmers angle; I'd have a separate table (or storage area somewhere) that stored the max,min and sum values associated with each field in your original query and update those values every time I updated and added a table record. (although deleting may be problematic if not handled correctly).

After the original query to populate these values is complete (which is the almost the same as the query you posted), you're essentially reducing your final query to getting one row from a data table, rather than computing everything all at once.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文