如何在不使用子查询的情况下从连接表中获取计数直方图?
我有很多看起来像这样的表:(id,user_id,object_id)。我经常对“有多少用户拥有一个对象?有多少用户有两个?等等”这一问题感兴趣。并想查看分布。
对此的明显答案如下所示:
select x.ucount, count(*)
from (select count(*) as ucount from objects_users group by user_id) as x
group by x.ucount
order by x.ucount;
这会产生如下结果:
ucount | count
-------|-------
1 | 15
2 | 17
3 | 23
4 | 104
5 | 76
7 | 12
在此处使用子查询对我来说感觉不优雅,我想弄清楚如何在不使用子查询的情况下产生相同的结果。此外,如果您要问的问题稍微复杂一些,那么从子查询中传递更多信息就会变得很混乱。例如,如果您希望数据按用户的创建日期进一步分组:
select
x.ucount,
(select cdate from users where id = x.user_id) as cdate,
count(*)
from (
select user_id, count(*) as ucount
from objects_users group by user_id
) as x
group by cdate, x.ucount,
order by cdate, x.ucount;
是否有某种方法可以避免子查询的爆炸?我想最终我的反对是出于美学目的,但这使得查询难以阅读和编写。
I have a lot of tables that look like this: (id, user_id, object_id). I am often interested in the question "how many users have one object? how many have two? etc." and would like to see the distribution.
The obvious answer to this looks like:
select x.ucount, count(*)
from (select count(*) as ucount from objects_users group by user_id) as x
group by x.ucount
order by x.ucount;
This produces results like:
ucount | count
-------|-------
1 | 15
2 | 17
3 | 23
4 | 104
5 | 76
7 | 12
Using a subquery here feels inelegant to me and I'd like to figure out how to produce the same result without. Further, if the question you're trying to ask is slightly more complicated it gets messy passing more information out of the subquery. For example, if you want the data further grouped by the user's creation date:
select
x.ucount,
(select cdate from users where id = x.user_id) as cdate,
count(*)
from (
select user_id, count(*) as ucount
from objects_users group by user_id
) as x
group by cdate, x.ucount,
order by cdate, x.ucount;
Is there some way to avoid the explosion of subqueries? I suppose in the end my objection is aesthetic, but it makes the queries hard to read and hard to write.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为无论您的 RDBMS 是什么,子查询都是执行此操作的正确方法。为什么会显得不优雅呢?
对于第二个查询,只需像这样加入
users
表:I think a subquery is exactly the appropriate way to do this, regardless of your RDBMS. Why would it be inelegant?
For the second query, just join the
users
table like this: