An aggregate function that only allows one unique input
I often find myself adding expressions to the group by clause that I am sure are unique. Sometimes it turns out I am wrong, because of an error in my SQL or a mistaken assumption, and the expression is not really unique.
In many cases I would much rather this generated a SQL error than silently, and sometimes very subtly, expanded my result set.
I would love to be able to do something like:
select product_id, unique description from product group by product_id
Obviously I can't implement that syntax myself, but something nearly as concise can be implemented with user-defined aggregates on some databases.
Would a special aggregate that only allows one unique input value be generally helpful in all versions of SQL? If so, could such a thing be implemented now on most databases? null values should be treated just like any other value, unlike the way the built-in aggregate avg typically works. (I have added answers with ways of implementing this for postgres and Oracle.)
The following example is intended to show how the aggregate would be used, but it is a simple case where it is obvious which expressions should be unique. Real usage would more likely be in larger queries, where it is easier to make mistaken assumptions about uniqueness.
tables:
 product_id | description
------------+-------------
          1 | anvil
          2 | brick
          3 | clay
          4 | door

 sale_id | product_id | cost
---------+------------+---------
       1 |          1 | £100.00
       2 |          1 | £101.00
       3 |          1 | £102.00
       4 |          2 | £3.00
       5 |          2 | £3.00
       6 |          2 | £3.00
       7 |          3 | £24.00
       8 |          3 | £25.00
queries:
> select * from product join sale using (product_id);
 product_id | description | sale_id | cost
------------+-------------+---------+---------
          1 | anvil       |       1 | £100.00
          1 | anvil       |       2 | £101.00
          1 | anvil       |       3 | £102.00
          2 | brick       |       4 | £3.00
          2 | brick       |       5 | £3.00
          2 | brick       |       6 | £3.00
          3 | clay        |       7 | £24.00
          3 | clay        |       8 | £25.00

> select product_id, description, sum(cost)
  from product join sale using (product_id)
  group by product_id, description;
 product_id | description | sum
------------+-------------+---------
          2 | brick       | £9.00
          1 | anvil       | £303.00
          3 | clay        | £49.00

> select product_id, solo(description), sum(cost)
  from product join sale using (product_id)
  group by product_id;
 product_id | solo  | sum
------------+-------+---------
          1 | anvil | £303.00
          3 | clay  | £49.00
          2 | brick | £9.00

error case:
> select solo(description) from product;
ERROR: This aggregate only allows one unique input
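
To make the intended semantics concrete, here is a minimal sketch of a solo aggregate using Python's built-in sqlite3 module. This is not the postgres or Oracle implementation mentioned above; SQLite is an assumption chosen only because it runs without a server. The names Solo, _EMPTY and demo are hypothetical, and costs are stored as plain numbers rather than a money type.

```python
import sqlite3

_EMPTY = object()  # sentinel meaning "no row seen yet" (distinct from SQL NULL, i.e. None)


class Solo:
    """Aggregate that returns its input only if every row in the group holds the same value."""

    def __init__(self):
        self.value = _EMPTY

    def step(self, value):
        if self.value is _EMPTY:
            self.value = value  # first row of the group
        elif value != self.value:
            # None == None is True in Python, so NULL is treated like any other value
            raise ValueError("This aggregate only allows one unique input")

    def finalize(self):
        # An empty group yields NULL
        return None if self.value is _EMPTY else self.value


def demo():
    con = sqlite3.connect(":memory:")
    con.create_aggregate("solo", 1, Solo)  # register solo(x) as a one-argument aggregate
    con.executescript("""
        create table product (product_id integer primary key, description text);
        create table sale (sale_id integer primary key, product_id integer, cost numeric);
        insert into product values (1,'anvil'),(2,'brick'),(3,'clay'),(4,'door');
        insert into sale values (1,1,100),(2,1,101),(3,1,102),
                                (4,2,3),(5,2,3),(6,2,3),(7,3,24),(8,3,25);
    """)
    rows = con.execute("""
        select product_id, solo(description), sum(cost)
        from product join sale using (product_id)
        group by product_id
        order by product_id
    """).fetchall()
    try:
        # Four different descriptions in one group: the aggregate should refuse
        con.execute("select solo(description) from product").fetchall()
        failed = False
    except sqlite3.OperationalError:
        # sqlite3 reports an exception raised inside step() as an OperationalError
        failed = True
    return rows, failed


if __name__ == "__main__":
    print(demo())
```

The sentinel is what lets null count as a value: a group containing only nulls returns null, while a group mixing null with anything else raises the error.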
An Oracle solution is
Rather than to_char(1/0) (which raises a divide-by-zero error), you can use a simple function which does
You can use a user-defined aggregate, but I'd be worried about the performance impact of switching between SQL and PL/SQL.
Here is my implementation for postgres (edited to treat null as a unique value too):
example tables for testing:
You should define a UNIQUE constraint on (product_id, description); then you never have to worry about there being two descriptions for one product.
And here is my implementation for Oracle. Unfortunately, I think you need one implementation for each base type: