What am I not understanding about SQL MAX()?
Data: (log_time is a DATETIME type)
log_id | action    | log_time            | user
--------------------------------------------------
1      | Processed | 2011-02-28 16:38:48 | 1
2      | Processed | 2011-03-02 16:56:43 | 5
3      | Processed | 2011-03-02 17:00:17 | 5
4      | Processed | 2011-03-03 08:59:33 | 5
Query:
SELECT log_time, user
FROM logs
WHERE action = "Processed"
GROUP BY action
HAVING MAX(log_time)
Result:
log_time            | user
--------------------------
2011-02-28 16:38:48 | 1
Clearly, this row does not have the max log_time at all. If I change the query to...
SELECT MAX(log_time), user
FROM logs
WHERE action = "Processed"
Then I get, naturally:
log_time            | user
--------------------------
2011-03-03 08:59:33 | 1
Now, the data I obviously want is the data in row 4: March 3, but user 5. I understand that I can get this by doing a simple SELECT ... ORDER BY log_time DESC LIMIT 1. But my question is, what am I doing with these MAX() queries that isn't correct? It would seem to me that if I ran a query with a HAVING MAX(), it would give me the row that, well, had the max. What am I not understanding about how MAX() works?
Edit: To elaborate my question, basically, when I see a query...
SELECT * FROM logs WHERE action = "Processed"
GROUP BY action HAVING MAX(log_time)
... my assumption, based on how the code appears, is that it will retrieve the row with the largest log_time where action is Processed. This appears to be a faulty assumption. What, then, does HAVING MAX() even mean?
Comments (5)
You need to include a GROUP BY clause for the aggregate function.
OR, if you're looking for a single value, don't include user at all:
And finally, if you want the user associated with that single max value, use a subquery
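The three options above can be sketched as follows. This is only an illustration, not the answerer's original SQL: it assumes Python's built-in sqlite3 module with an in-memory table as a stand-in for the MySQL logs table, and it stores log_time as ISO-formatted text so that string comparison matches chronological order.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `logs` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (log_id INTEGER PRIMARY KEY, action TEXT, "
    "log_time TEXT, user INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "Processed", "2011-02-28 16:38:48", 1),
        (2, "Processed", "2011-03-02 16:56:43", 5),
        (3, "Processed", "2011-03-02 17:00:17", 5),
        (4, "Processed", "2011-03-03 08:59:33", 5),
    ],
)

# Option 1: group by user, so each user gets its own MAX(log_time).
per_user = conn.execute(
    "SELECT user, MAX(log_time) FROM logs "
    "WHERE action = 'Processed' GROUP BY user ORDER BY user"
).fetchall()

# Option 2: ask only for the single maximum value, with no user column.
latest = conn.execute(
    "SELECT MAX(log_time) FROM logs WHERE action = 'Processed'"
).fetchone()[0]

# Option 3: use a subquery to find the max, then fetch the row that has it.
latest_row = conn.execute(
    "SELECT log_time, user FROM logs "
    "WHERE action = 'Processed' "
    "AND log_time = (SELECT MAX(log_time) FROM logs WHERE action = 'Processed')"
).fetchall()

print(per_user)
print(latest)
print(latest_row)
```

Only the third query pairs the latest log_time with the user that actually produced it (row 4, user 5).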
Based on the input of others, particularly Damien_The_Unbeliever, I realized that my problem was that HAVING MAX() doesn't actually do anything. It will just pipe out the date, which doesn't work as it's not being compared to anything.
When I say HAVING MAX(log_time), that translates to HAVING 2011-03-03 08:59:33, which doesn't tell the SQL what it's supposed to have; it's just a statement, like IF (5). I think.
HAVING continues to be somewhat of a mystery for me, but I think this is the reason why this particular issue was causing me grief.
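A small sketch of that point: HAVING filters whole groups on a condition, so a bare MAX(log_time) only acts as a truthiness test, while a comparison can actually keep or drop the group. The example assumes SQLite (via Python's sqlite3 module) in place of MySQL. The cutoff dates are made up for illustration. The exact way a date value is coerced to true differs between engines, but a non-zero value counts as true in both.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `logs` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (log_id INTEGER PRIMARY KEY, action TEXT, "
    "log_time TEXT, user INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "Processed", "2011-02-28 16:38:48", 1),
        (2, "Processed", "2011-03-02 16:56:43", 5),
        (3, "Processed", "2011-03-02 17:00:17", 5),
        (4, "Processed", "2011-03-03 08:59:33", 5),
    ],
)

base = (
    "SELECT action, MAX(log_time) FROM logs "
    "WHERE action = 'Processed' GROUP BY action "
)

# Bare HAVING: the aggregate is non-zero, so the single group is simply kept.
bare = conn.execute(base + "HAVING MAX(log_time)").fetchall()

# HAVING with a comparison: the group is kept only if the condition holds.
# Both cutoff dates below are hypothetical values chosen for the example.
after_march_1 = conn.execute(
    base + "HAVING MAX(log_time) > '2011-03-01 00:00:00'"
).fetchall()
after_march_4 = conn.execute(
    base + "HAVING MAX(log_time) > '2011-03-04 00:00:00'"
).fetchall()

print(bare)
print(after_march_1)
print(after_march_4)
```

In none of these cases does HAVING pick a row inside the group; it only decides whether the whole "Processed" group appears in the result.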
I would order by MAX(log_time) and select the top 1, since the sorting will happen anyway.
This gives the output:
2011-03-03 08:59:33 5
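One query that returns this row is the ORDER BY ... LIMIT 1 form mentioned in the question. The sketch below is not necessarily the query these answers used. It assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, with log_time stored as ISO-formatted text.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `logs` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (log_id INTEGER PRIMARY KEY, action TEXT, "
    "log_time TEXT, user INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "Processed", "2011-02-28 16:38:48", 1),
        (2, "Processed", "2011-03-02 16:56:43", 5),
        (3, "Processed", "2011-03-02 17:00:17", 5),
        (4, "Processed", "2011-03-03 08:59:33", 5),
    ],
)

# Sort newest first and keep only the first row; no aggregate is involved,
# so log_time and user are guaranteed to come from the same row.
newest = conn.execute(
    "SELECT log_time, user FROM logs "
    "WHERE action = 'Processed' "
    "ORDER BY log_time DESC LIMIT 1"
).fetchone()

print(newest)
```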
The HAVING clause is used to filter out groups that do not meet a certain criterion. The way you have defined the criterion, the "Processed" group DOES match the HAVING clause, since its MAX(log_time) contains the maximum log_time (or perhaps HAVING MAX(log_time) evaluates to true; either way, it's a weird statement)...
What you probably want is to SELECT the maximum log_time, in which case the query would be:
Now, the reason you get "1" as user is that the user column is not part of the GROUP BY clause. This means that MySQL doesn't know which user row you want in your output. It could be any of the 4 rows really. So, another way to formulate the question would be:
NOTE: the SQL query that you give as an example is not a valid SQL query according to Standard SQL. It works on MySQL, but that's due/thanks to how MySQL has implemented GROUP BY. In standard SQL, the only things you can select are results of aggregate functions and/or columns mentioned in the GROUP BY clause.
So, in other database systems, you wouldn't be able to select the user column, since it is not a GROUP BY column and not the result of an aggregate function. In order for it to be valid Standard SQL, you'd have to write:
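The answer's own queries are not shown here. As one possible sketch (an assumption, not the author's exact SQL), the queries below select only grouped columns or aggregates, and join back to the table to recover the matching user. SQLite, run through Python's sqlite3 module, stands in for MySQL. SQLite is also lenient about ungrouped columns, so this shows the compliant form but does not demonstrate a strict engine rejecting the original query.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `logs` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (log_id INTEGER PRIMARY KEY, action TEXT, "
    "log_time TEXT, user INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, "Processed", "2011-02-28 16:38:48", 1),
        (2, "Processed", "2011-03-02 16:56:43", 5),
        (3, "Processed", "2011-03-02 17:00:17", 5),
        (4, "Processed", "2011-03-03 08:59:33", 5),
    ],
)

# Every selected column is either in GROUP BY (action) or an aggregate.
per_action = conn.execute(
    "SELECT action, MAX(log_time) FROM logs "
    "WHERE action = 'Processed' GROUP BY action"
).fetchall()

# To also get the user, join the grouped result back to the table so the
# user column comes from the row whose log_time equals the group maximum.
with_user = conn.execute(
    "SELECT l.log_time, l.user "
    "FROM logs AS l "
    "JOIN (SELECT action, MAX(log_time) AS latest FROM logs "
    "      WHERE action = 'Processed' GROUP BY action) AS m "
    "  ON l.action = m.action AND l.log_time = m.latest"
).fetchall()

print(per_action)
print(with_user)
```

The join form never asks the database to guess which user belongs to an aggregated row, which is the ambiguity described above.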