How does SQL choose which row to show when grouping multiple rows together?
Consider the following table:
CREATE TABLE t
(
    a INTEGER NOT NULL,
    b INTEGER NOT NULL,
    c INTEGER,
    PRIMARY KEY (a, b)
);
Now if I do this:
SELECT a,b,c FROM t GROUP BY a;
I expect to get each distinct value of a only once. But since I'm asking for b and c as well, it's going to give me one row for every value of a. Therefore, if, for a single value of a, there are many rows to choose from, how can I predict which row SQL will choose? My tests show that it chooses to return the row for which b is the greatest. But what is the logic in that? How would this apply to strings, blobs, dates, or anything else?
My question: How does SQL choose which row to show when grouping multiple rows together?
btw: My particular problem concerns SQLITE3, but I'm guessing this is an SQL issue not dependent on the DBMS...
Comments (4)
That shouldn't actually work in a decent DBMS :-)
Any column not used in the group by clause should be subject to an aggregation function. If it doesn't complain in SQLite (and I have no immediate reason to doubt you), I'd just put it down to the way the DBMS is built. From memory, there are a few areas where it doesn't worry too much about the "purity" of the data (such as every column being able to hold multiple types, the type belonging to the data in that row/column intersection rather than the column specification).
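To make the aggregation point concrete, here is a minimal sketch using Python's built-in sqlite3 module against the table from the question. The sample rows are invented purely for illustration, and the choice of MAX and SUM is an assumption about what one might want from b and c, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t ("
    " a INTEGER NOT NULL,"
    " b INTEGER NOT NULL,"
    " c INTEGER,"
    " PRIMARY KEY (a, b))"
)
# Hypothetical sample data, made up for this example.
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (2, 1, 30), (2, 3, 40), (2, 5, 50)],
)

# Every column outside GROUP BY is wrapped in an aggregate, so each
# output value is well defined for its group.
rows = conn.execute(
    "SELECT a, MAX(b), SUM(c) FROM t GROUP BY a ORDER BY a"
).fetchall()
print(rows)  # [(1, 2, 30), (2, 5, 120)]
```

Note that MAX(b) and SUM(c) are computed independently over the group, so the output row does not correspond to any single row of t.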
All the SQL engines that I know will complain about the query that you mentioned with an error message like "b and c appear in the field list but not in the group by list". You are only allowed to use b or c in an aggregate function (like MAX / MIN / COUNT / AVG whatever) or you'll be forced to add them in the GROUP BY list.
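The second option mentioned here, adding the columns to the GROUP BY list, could look like the sketch below (Python's sqlite3 module, with made-up sample rows). Because (a, b) is the primary key, grouping by all three columns leaves one row per group, so this form simply returns every row; the COUNT query is included only to show the aggregate alternative side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t ("
    " a INTEGER NOT NULL,"
    " b INTEGER NOT NULL,"
    " c INTEGER,"
    " PRIMARY KEY (a, b))"
)
# Hypothetical sample data, made up for this example.
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (2, 1, 30)],
)

# All selected columns appear in GROUP BY, so no row has to be "chosen".
grouped = conn.execute(
    "SELECT a, b, c FROM t GROUP BY a, b, c ORDER BY a, b, c"
).fetchall()

# Grouping by a alone, with only an aggregate over the other rows.
counts = conn.execute(
    "SELECT a, COUNT(*) FROM t GROUP BY a ORDER BY a"
).fetchall()

print(grouped)  # [(1, 1, 10), (1, 2, 20), (2, 1, 30)]
print(counts)   # [(1, 2), (2, 1)]
```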
You're not quite correct about your assumption that this is RDBMS-independent. Most RDBMS don't allow selecting fields that are not also in the GROUP BY clause. Exceptions to this (to my knowledge) are SQLite and MySQL. In general, you shouldn't do this, because the values for b and c are chosen pretty arbitrarily (depending on the applied grouping algorithm). Even if this may be documented in your database, it's always better to express a query in a way that fully and unambiguously specifies the outcome.
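One way to spell out the intended row explicitly, assuming the goal is "the row with the greatest b for each a" as observed in the question, is to join the table against a per-group maximum. This is a sketch with invented sample rows and a hypothetical alias max_b, run through Python's sqlite3 module; it is one of several possible formulations, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t ("
    " a INTEGER NOT NULL,"
    " b INTEGER NOT NULL,"
    " c INTEGER,"
    " PRIMARY KEY (a, b))"
)
# Hypothetical sample data, made up for this example.
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (2, 1, 30), (2, 3, 40), (2, 5, 50)],
)

# The subquery finds the largest b per a; the join then pulls the whole
# matching row, so c comes from the same row as that b.
query = """
    SELECT t.a, t.b, t.c
    FROM t
    JOIN (SELECT a, MAX(b) AS max_b FROM t GROUP BY a) AS m
      ON t.a = m.a AND t.b = m.max_b
    ORDER BY t.a
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)  # [(1, 2, 20), (2, 5, 50)]
```

Since (a, b) is the primary key, at most one row per a can match the maximum, so the result does not depend on how the engine happens to scan the table.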
It's not a matter of what the database will choose, but of the order in which your data are going to be returned.
Your primary key is handling your sort order by default, since you didn't provide one.
You can use ORDER BY a, c if that's what you want.