使用ratio_to_report分析
我试图获取一组特定值所具有的行的百分比。最好通过例子来解释。我可以使用ratio-to-report函数和over()非常简单地通过每一列来完成此操作,但是我遇到了多个分组的问题
假设表有2列:
column a column b
1000 some data
1100 some data
2000 some data
1400 some data
1500 some data
通过以下查询,我可以得到这个域集,每个都是总行数的 20%
select columna, count(*), trunc(ratio_to_report(count(columna)) over() * 100, 2) as perc
from table
group by columna
order by perc desc;
但是,我需要的是例如确定百分比和包含 1000、1400 或 2000 的行数;通过查看它,您可以看出它是 60%,但需要一个查询来返回该值。这需要高效,因为查询将针对数百万行运行。就像我之前说的,我对单个值及其百分比进行了处理,但倍数才是让我困惑的。
似乎我需要能够在某处放置 IN 子句,但值不会每次都是这些特定值。如果有意义的话,我需要从另一个表中获取“IN”部分的值。我想我需要某种多重分组。
I am trying to get the percentage of rows that a set of particular value has. Best explained by example. I can do this by each column very simply using ratio-to-report function and over(), but am having issues with multiple groupings
Assume table has 2 columns:
column a column b
1000 some data
1100 some data
2000 some data
1400 some data
1500 some data
With the following query, I can get for this domain set, each one is 20% of the total rows
select columna, count(*), trunc(ratio_to_report(count(columna)) over() * 100, 2) as perc
from table
group by columna
order by perc desc;
However, what I need is for example to determine the percentage & count of the rows that contain 1000, 1400 or 2000; From looking at it, you can tell its 60%, but need a query to return that. This needs to be efficient, as the query will be running against millions of rows. Like I said before, I have this working on a single value and its percentage, but the multiple is what is throwing me.
Seems like I need to be able to put an IN clause somewhere, but the values will not be these specific values each time. I will need to get the values for the "IN" part of it from another table, if that makes sense. guess I need some kind of multiple grouping.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您可能正在寻找类似的东西
Potentially, you're looking for something like
我不确定我是否完全理解该要求,但您是否需要
ratio_to_report
?看看下面的内容,让我知道这与您想要的有多接近,我们可以从那里开始工作!T1 是包含示例数据的表
T2 是您提到的查找表(在其中获取 ID 列表)
T1->T2 的左连接将返回 T1 中的所有行与 T2 中的所有匹配行配对。对于 T1 中不存在于集合 (T2) 中的每个 A,结果将用 NULL 填充。我们可以利用
COUNT()
不计算(呵呵)空值的事实。运行查询的结果是:
I'm not sure I entirely understand the requirement, but do you need
ratio_to_report
at all? Have a look at the following, and let me know how close this is to what you want, and we can work from there!T1 is the table containing your sample data
T2 is the lookup table you mentioned (where you get the list of IDs)
A left join from T1->T2 will return all rows in T1 paired with all matching rows in T2. For each A in T1 that does not exist in your set (T2), the result will be padded with NULL. We can exploit the fact that
COUNT()
doesn't count (hehe) nulls.The result of running the query is: