Calculating percentages with a GROUP BY query
I have a table with 3 columns which looks like this:
File   User  Rating (1-5)
------------------------------
00001  1     3
00002  1     4
00003  2     2
00004  3     5
00005  4     3
00005  3     2
00006  2     3
Etc.
I want to generate a query that outputs the following (for each user and rating, display the number of files as well as percentage of files):
User  Rating  Count  Percentage
-----------------------------------
1     1       3      .18
1     2       6      .35
1     3       8      .47
2     5       12     .75
2     3       4      .25
With PostgreSQL, I know how to write a query that produces the first 3 columns (shown below), but I can't figure out how to calculate the percentage within the GROUP BY:
SELECT
    User,
    Rating,
    Count(*)
FROM
    Results
GROUP BY
    User, Rating
ORDER BY
    User, Rating
Here I want the percentage calculation to apply to each user/rating group.
You can use analytic functions on top of aggregate functions:
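The query will work in any RDBMS that implements analytic (window) functions. Notice that the aggregate function is nested inside the analytic function. This works because of the order of operations in SQL: the GROUP BY aggregation is computed first, and window functions are then evaluated over the grouped rows.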
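A minimal sketch of the nesting described above, written for PostgreSQL. The names are assumptions: a table results(file, user_id, rating), where user_id stands in for the question's User column, because an unquoted user is a reserved word in PostgreSQL and evaluates to the current role name.

```sql
-- Assumed schema (hypothetical names): results(file, user_id, rating)
SELECT
    user_id,
    rating,
    COUNT(*) AS rating_count,
    -- COUNT(*) is computed per (user_id, rating) group first; the window SUM
    -- then adds those group counts across all rows sharing the same user_id.
    -- SUM over bigint yields numeric in PostgreSQL, so this is not integer division.
    ROUND(COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY user_id), 2) AS percentage
FROM results
GROUP BY user_id, rating
ORDER BY user_id, rating;
```

With the counts from the question's expected output for user 1 (3, 6 and 8 files), the window sum is 17, giving 3/17, 6/17 and 8/17, which round to .18, .35 and .47.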
Or
I would check which of these yields the better query plan, using the appropriate tool for your RDBMS.
Alternatively, you can do the old-school way — arguably easier to grok:
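One way this could look, as a sketch rather than the answerer's exact query: join each user's rows to a derived table holding that user's total, then divide. The table and column names (results, user_id, rating) are assumptions; user_id stands in for the question's User column, which would need quoting in PostgreSQL.

```sql
-- Assumed schema (hypothetical names): results(file, user_id, rating)
SELECT
    r.user_id,
    r.rating,
    COUNT(*) AS rating_count,
    -- Cast to numeric before dividing so the result is not truncated to 0 or 1.
    ROUND(CAST(COUNT(*) AS numeric) / t.total, 2) AS percentage
FROM results AS r
JOIN (
    -- One row per user with that user's total number of rated files.
    SELECT user_id, COUNT(*) AS total
    FROM results
    GROUP BY user_id
) AS t ON t.user_id = r.user_id
GROUP BY r.user_id, r.rating, t.total
ORDER BY r.user_id, r.rating;
```

t.total appears in the GROUP BY only because it is used outside an aggregate; it is constant per user, so it does not change the grouping.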
Cheers!
In TSQL this should work
The best way to do this would be with window functions.
If the data is not too big, you might consider subqueries for ease of use. Make sure the subquery reads the same data as the outer query (same table, same filters).
I tested this on a similar table
Multiplying by 1.0 here makes sure we switch away from integer arithmetic, so our percentages do not come out as 0 or 1 due to integer division.
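A sketch of the subquery approach with the 1.0 multiplication, under assumed names (table results with columns user_id and rating; user_id stands in for the question's User column). It is not the query that was tested above, only an illustration of the same idea.

```sql
-- Assumed schema (hypothetical names): results(file, user_id, rating)
SELECT
    r.user_id,
    r.rating,
    COUNT(*) AS rating_count,
    -- Without the 1.0, integer / integer would truncate (e.g. 3 / 17 = 0).
    -- 1.0 * COUNT(*) is a decimal value, so the division keeps its fraction.
    1.0 * COUNT(*) / (
        -- Correlated subquery: total number of rows for the same user.
        SELECT COUNT(*)
        FROM results AS r2
        WHERE r2.user_id = r.user_id
    ) AS percentage
FROM results AS r
GROUP BY r.user_id, r.rating
ORDER BY r.user_id, r.rating;
```

The subquery deliberately reads the same table with no extra filters, so the denominator covers exactly the rows counted in the numerator for that user.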
Using PARTITION BY with window functions is probably faster than subqueries, but the subquery syntax relies on more widely known SQL constructs. Rating is a reserved word in T-SQL (SQL Server), so I had to write [Rating]; in Postgres this is not required.