如何让子查询更快
对于作者概述,我们正在寻找一个查询,该查询将显示所有作者,包括他们最好的书。此查询的问题是速度不够。只有大约 1500 位作者,生成概述的查询目前需要 20 秒。
主要问题似乎是生成每个人对所有书籍的平均评分。 通过选择以下查询,它仍然相当快。
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id)
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
但是,如果我们通过选择具有平均评分的书籍来使子查询变得更复杂一点,则需要整整 20 秒才能生成结果集。 查询将如下所示:
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id
inner join user_rating as D on B.id=D.thriller_id
group by B.id
order by AVG(D.rating))
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
有人对加快此查询有好的建议吗?
for an author overview we are looking for a query which will show all the authors including their best book. The problem with this query is that it lacks speed. There are only about 1500 authors and the query do generate the overview is currently taking 20 seconds.
The main problem seems te be generating the average rating of all the books per person.
By selecting the following query, it is still rather fast
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id)
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
However, if we make the subquery a little more complex by selecting the book with the average rating it takes a full 20 seconds to generate a resultset.
The query would then be as follows:
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id
inner join user_rating as D on B.id=D.thriller_id
group by B.id
order by AVG(D.rating))
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
Anyone got a good suggestion to speed up this query?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
计算平均值需要进行表扫描,因为您必须对值进行求和,然后除以(相关)行数。这反过来意味着您需要进行大量的重新扫描;那很慢。你能计算一次平均值并存储它们吗?这将使您的查询使用这些预先计算的值。 (是的,它对数据进行非规范化,但对性能进行非规范化通常是必要的;性能和最少数据之间需要进行权衡。)
使用临时表作为平均值的存储可能是合适的。
Calculating an average requires a table scan since you've got to sum the values and then divide by the number of (relevant) rows. This in turn means that you're doing a lot of rescanning; that's slow. Can you calculate the averages once and store them? That would let your query use those pre-computed values. (Yes, it denormalizes the data, but denormalizing for performance is often necessary; there's a trade-off between performance and minimal data.)
It might be appropriate to use a temporary table as the store of the averages.