帮我优化这个查询
我对我正在设计的应用程序有这样的查询。有一个参考文献表、一个作者表和一个reference_authors 表。有一个子查询返回给定参考文献的所有作者,然后我将其显示为 php 格式。子查询和查询单独运行既漂亮又快速。然而,一旦将子查询放入主查询中,整个过程就需要 120 秒以上才能运行。我希望能有一些新的眼光来看待这个。 谢谢。
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
(SELECT GROUP_CONCAT(a.author_name)
FROM authors_final AS a
INNER JOIN reference_authors AS ra2 ON ra2.author_id = a.author_id
WHERE ra2.reference_id = rf.reference_id
GROUP BY ra2.reference_id) AS authors
FROM
references_final AS rf
INNER JOIN reference_authors AS ra ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id;
这是固定查询。谢谢大家的建议。
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
GROUP_CONCAT(a.author_name) AS authors
FROM
references_final as rf
INNER JOIN (reference_authors AS ra INNER JOIN authors_final AS a ON ra.author_id = a.author_id)
ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id
GROUP BY rf.reference_id
I have this query for an application that I am designing. There is a table of references, an authors table and a reference_authors table. There is a sub query to return all authors for a given reference which I then display formatted in php. The subquery and query run individually are both nice and speedy. However as soon as the subquery is put into the main query the whole thing takes over 120s to run. I would apprecaite some fresh eyes on this one.
Thanks.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
(SELECT GROUP_CONCAT(a.author_name)
FROM authors_final AS a
INNER JOIN reference_authors AS ra2 ON ra2.author_id = a.author_id
WHERE ra2.reference_id = rf.reference_id
GROUP BY ra2.reference_id) AS authors
FROM
references_final AS rf
INNER JOIN reference_authors AS ra ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id;
Here is the fixed query. Thanks guys for the recommendations.
SELECT
rf.reference_id,
rf.reference_type_id,
rf.article_title,
rf.publication,
rf.annotation,
rf.publication_year,
GROUP_CONCAT(a.author_name) AS authors
FROM
references_final as rf
INNER JOIN (reference_authors AS ra INNER JOIN authors_final AS a ON ra.author_id = a.author_id)
ON rf.reference_id = ra.reference_id
LEFT JOIN reference_institutes AS ri ON rf.reference_id = ri.reference_id
GROUP BY rf.reference_id
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
虽然不是每个子查询都可以重写为内部联接,但我认为您可以。
从 120 秒缩短到 78 毫秒,这是一个不错的改进,大约提高了三个数量级。这天剩下的时间请假。
当您明天回来时,开始在源代码中查找其他子查询。
Although not every subquery can be rewritten as an inner join, I think yours can.
From 120 seconds to 78 milliseconds is not a bad improvement--about three orders of magnitude. Take the rest of the day off.
When you come back tomorrow, start looking for other subqueries in your source code.
您说子查询在隔离中很好而且速度很快,但它现在显然是针对每一行运行的 - 100 行 = 100 个子查询。
假设您的所有外键都有索引,这与子查询一样好。
一种选择是左连接作者并创建笛卡尔积 - 您将返回更多行,并且需要一些代码才能获得相同的最终结果,但它会给数据库带来更少的压力并且运行速度更快。
如果您已开启分页并返回 10 行,那么发出 10 次单独调用来隔离作者也会非常快。
You say the subquery is nice and speedy in isolation but its now obviously running for every single row - 100 rows = 100 sub queries.
Assuming you have indexes on all your foreign keys that's as good as it gets as a sub query.
One option is to left join authors and create a Cartesian product - you'll have a lot more rows returned and will need some code to get to the same end result but it will put less strain on the db and will run quicker.
If you've got paging on and say are returning 10 rows, issung 10 individual calls to get the authors in isolation would also be be pretty quick.