Disambiguating columns used in GROUP BY that come from the SELECT clause
Assume a table person exists with columns id, first_name, last_name, and a table dog exists with columns id, owner_id, name.
Take the following query:
select
    first_name + last_name as name,
    count(*) as dog_count
from
    person
    join dog on person.id = dog.owner_id
group by
    person.id,
    first_name + last_name;
This query works and serves its purpose, but one thing irks me whenever I have to write group by clauses. I went searching for solutions today and didn't really find one that I liked.
The alias name from the query's select clause doesn't exist yet when the group by is processed (due to the logical order of execution). Many RDBMSs allow it anyway as an extension, but most of those (I think all) run afoul of the following problem: if I use name instead of first_name + last_name in the group by clause, then instead of going to the select clause and using name from there, it resolves to name from the dog table.
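The resolution rule described above can be checked against a real engine. The following is a minimal sketch using Python's built-in sqlite3 module. SQLite itself, the sample rows, and the use of || (SQLite's string concatenation operator) in place of + are assumptions made for illustration and are not part of the original query; other engines may resolve the name differently.

```python
import sqlite3

# In-memory database with the two tables from the question (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE dog (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT);
    INSERT INTO person VALUES (1, 'Ann', 'Lee');
    INSERT INTO dog VALUES (1, 1, 'Rex'), (2, 1, 'Fido');
""")

# Grouping by the full expression: one group per person.
by_expression = conn.execute("""
    SELECT first_name || ' ' || last_name AS name, count(*) AS dog_count
    FROM person JOIN dog ON person.id = dog.owner_id
    GROUP BY person.id, first_name || ' ' || last_name
""").fetchall()

# Grouping by the bare identifier: SQLite binds it to dog.name rather than
# the select alias, so each distinct dog name forms its own group.
by_bare_name = conn.execute("""
    SELECT first_name || ' ' || last_name AS name, count(*) AS dog_count
    FROM person JOIN dog ON person.id = dog.owner_id
    GROUP BY person.id, name
""").fetchall()

print(by_expression)
print(by_bare_name)
```

In this sketch the second query returns one row per dog instead of one row per owner, which is the silent mismatch the question is worried about.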
I don't know if it's part of the standard, but many RDBMSs let you use column numbers to refer to columns from the select, so I can do group by 1. That becomes problematic with larger queries that may specify columns dynamically, since the ordering may change.
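To make the fragility of positional grouping concrete, here is a small sketch, again assuming SQLite through Python's sqlite3 module with made-up rows and || for concatenation. The dog_counts helper is hypothetical; it only stands in for code that builds the select list dynamically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE dog (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT);
    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Lee');
    INSERT INTO dog VALUES (1, 1, 'Rex'), (2, 1, 'Fido'), (3, 2, 'Spot');
""")

def dog_counts(select_list):
    # Hypothetical helper: the select list is supplied by the caller, while
    # the GROUP BY always points at whatever happens to be in position 1.
    sql = f"""
        SELECT {select_list}
        FROM person JOIN dog ON person.id = dog.owner_id
        GROUP BY 1
    """
    return sorted(conn.execute(sql).fetchall())

full_name = "first_name || ' ' || last_name AS name"

# Position 1 is the full name: one group per owner.
by_name = dog_counts(f"{full_name}, count(*) AS dog_count")

# A new leading column shifts the positions: the same GROUP BY 1 now groups
# by last_name and merges both owners into a single row.
by_last_name = dog_counts(f"last_name, {full_name}, count(*) AS dog_count")

print(by_name)
print(by_last_name)
```

Nothing in the second call raises an error; the grouping simply changes meaning, which is why ordinals are risky when the select list is generated.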
So... the final question: does any RDBMS that anybody is aware of have a way to disambiguate by column name (or some other consistent attribute) when pulling columns from the select clause? An example of something that could work would be group by select.name to specify "take the column name from the select clause and apply it to the group by clause".
Does anybody know whether this has ever been proposed for inclusion in the SQL standard?
Comments (1)
You're right that GROUP BY happens before SELECT. Therefore, GROUP BY does not yet know that first_name + last_name is aliased as name in the SELECT, so it takes name to refer to dog.name. If it really bothers you, you can write your group by like this.
However, in general, when it comes to ambiguity, you can always put the table name before the column name to specify which table you're referring to, assuming that column actually exists in the base table. As mentioned, that still would not work here, because name is a select alias rather than a base-table column. Referring to the alias would work in an ORDER BY clause, though, since that happens after the SELECT.
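As a sketch of the ORDER BY point, and of one possible workaround that is not taken from the answer above, the following again assumes SQLite through Python's sqlite3 module, made-up rows, and || for concatenation. The workaround wraps the expression in a derived table (named owner_rows here purely for illustration) so that name is an ordinary column by the time the outer GROUP BY sees it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE dog (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT);
    INSERT INTO person VALUES (1, 'Zed', 'Young'), (2, 'Ann', 'Lee');
    INSERT INTO dog VALUES (1, 1, 'Alfie'), (2, 2, 'Rex'), (3, 2, 'Fido');
""")

# ORDER BY can see select aliases, so "name" here sorts by the owner's
# full name rather than by dog.name.
ordered = conn.execute("""
    SELECT first_name || ' ' || last_name AS name, dog.name AS dog_name
    FROM person JOIN dog ON person.id = dog.owner_id
    ORDER BY name, dog_name
""").fetchall()

# Derived table: the inner query exposes only person_id and name, so the
# outer GROUP BY has exactly one candidate for "name".
grouped = conn.execute("""
    SELECT name, count(*) AS dog_count
    FROM (
        SELECT person.id AS person_id,
               first_name || ' ' || last_name AS name
        FROM person JOIN dog ON person.id = dog.owner_id
    ) AS owner_rows
    GROUP BY person_id, name
    ORDER BY name
""").fetchall()

print(ordered)
print(grouped)
```

The derived table costs an extra level of nesting, but it keeps the grouping tied to a column name instead of a position, so reordering the select list does not change the result.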