使用子选择构建此 MySQL 语句的更好方法

发布于 2024-08-26 12:29:07 字数 841 浏览 1 评论 0原文

我的数据库中有五个表。成员、项目、评论、投票和国家/地区。我想要获得 10 件物品。我想获得每个项目的评论和投票数。我还想要提交每个项目的成员以及他们来自的国家/地区。

在此处和其他地方发布后,我开始使用子选择来获取计数,但此查询需要 10 秒或更长时间!

SELECT `items_2`.*, 
   (SELECT COUNT(*) 
   FROM `comments` 
   WHERE (comments.Script = items_2.Id) 
   AND (comments.Active = 1)) 
  AS `Comments`, 
   (SELECT COUNT(votes.Member) 
   FROM `votes` 
   WHERE (votes.Script = items_2.Id) 
   AND (votes.Active = 1)) 
  AS `votes`, 
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

我的问题是,这是否是正确的方法,是否有更好的方法来编写此语句,或者是否有一种完全不同的方法会更好。我应该单独运行子选择并聚合信息吗?

I have five tables in my database. Members, items, comments, votes and countries. I want to get 10 items. I want to get the count of comments and votes for each item. I also want the member that submitted each item, and the country they are from.

After posting here and elsewhere, I started using subselects to get the counts, but this query is taking 10 seconds or more!

SELECT `items_2`.*, 
   (SELECT COUNT(*) 
   FROM `comments` 
   WHERE (comments.Script = items_2.Id) 
   AND (comments.Active = 1)) 
  AS `Comments`, 
   (SELECT COUNT(votes.Member) 
   FROM `votes` 
   WHERE (votes.Script = items_2.Id) 
   AND (votes.Active = 1)) 
  AS `votes`, 
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

My question is whether this is the right way to do this, if there's better way to write this statement OR if there's a whole different approach that will be better. Should I run the subselects separately and aggregate the information?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

别靠近我心 2024-09-02 12:29:07

是的,您可以将子查询重写为聚合联接(见下文),但我几乎可以肯定,速度缓慢是由于缺少索引而不是查询本身造成的。使用 EXPLAIN 查看内容您可以添加索引以使查询在几分之一秒内运行。

作为记录,这里是等效的聚合连接。

SELECT `items_2`.*,
  c.cnt AS `Comments`,
  v.cnt AS `votes`,
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN (
  SELECT Script, COUNT(*) AS cnt 
   FROM `comments` 
   WHERE Active = 1
   GROUP BY Script
) AS c
ON c.Script = items_2.Id 
LEFT JOIN ( 
  SELECT votes.Script, COUNT(*) AS cnt 
   FROM `votes` 
   WHERE Active = 1
   GROUP BY Script
) AS v
ON v.Script = items_2.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

但是,因为您使用的是LIMIT 10,所以几乎可以肯定您当前拥有的子查询比我提供的等效聚合连接效果更好(或者更好)以上供参考。

这是因为,在聚合连接查询的情况下,糟糕的优化器(MySQL 的优化器远非出色)最终可能会对 的全部内容执行 COUNT(*) 聚合工作。 >CommentsVotes 表,然后浪费地丢弃除 10 个值(您的 LIMIT)之外的所有内容,而在您的原始查询的情况下,它会从一开始就丢弃,仅查看 CommentsVotes 表中严格的最小值。

更准确地说,以原始查询的方式使用子查询通常会产生所谓的 带有索引查找的嵌套循环。使用聚合联接通常会导致合并哈希连接与索引扫描或表扫描。当循环数量较小(在您的情况下为 10)时,前者(嵌套循环)比后者(合并和散列连接)更有效。但是,当前者导致太多循环时,后者会变得更高效(数万/数十万或更多),尤其是在磁盘速度较慢但内存较多的系统上。

Yes, you can rewrite the subqueries as aggregate joins (see below), but I am almost certain that the slowness is due to missing indices rather than to the query itself. Use EXPLAIN to see what indices you can add to make your query run in a fraction of a second.

For the record, here is the aggregate join equivalent.

SELECT `items_2`.*,
  c.cnt AS `Comments`,
  v.cnt AS `votes`,
  `countrys`.`Name` AS `Country` 
FROM `items` AS `items_2` 
INNER JOIN `members` ON items_2.Member=members.Id AND members.Active = 1 
INNER JOIN `members` AS `members_2` ON items_2.Member=members.Id 
LEFT JOIN (
  SELECT Script, COUNT(*) AS cnt 
   FROM `comments` 
   WHERE Active = 1
   GROUP BY Script
) AS c
ON c.Script = items_2.Id 
LEFT JOIN ( 
  SELECT votes.Script, COUNT(*) AS cnt 
   FROM `votes` 
   WHERE Active = 1
   GROUP BY Script
) AS v
ON v.Script = items_2.Id 
LEFT JOIN `countrys` ON countrys.Id = members.Country 
GROUP BY `items_2`.`Id` 
ORDER BY `Created` DESC 
LIMIT 10

However, because you are using LIMIT 10, you are almost certainly as well off (or better off) with the subqueries that you currently have than with the aggregate join equivalent I provided above for reference.

This is because a bad optimizer (and MySQL's is far from stellar) could, in the case of the aggregate join query, end up performing the COUNT(*) aggregation work for the full contents of the Comments and Votes table before wastefully throwing everything but 10 values (your LIMIT) away, whereas in the case of your original query it will, from the start, only look at the strict minimum as far as the Comments and Votes tables are concerned.

More precisely, using subqueries in the way that your original query does typically results in what is called nested loops with index lookups. Using aggregate joins typically results in merge or hash joins with index scans or table scans. The former (nested loops) are more efficient than the latter (merge and hash joins) when the number of loops is small (10 in your case.) The latter, however, get more efficient when the former would result in too many loops (tens/hundreds of thousands or more), especially on systems with slow disks but lots of memory.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文