MySQL 中的 ROW_NUMBER()
MySQL 中有没有一种很好的方法来复制 SQL Server 函数ROW_NUMBER()
?
例如:
SELECT
col1, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS intRow
FROM Table1
然后我可以添加一个条件将 intRow
限制为 1,以获得每个 (col1, col2) 具有最高
对。col3
的单行)
Is there a nice way in MySQL to replicate the SQL Server function ROW_NUMBER()
?
For example:
SELECT
col1, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3 DESC) AS intRow
FROM Table1
Then I could, for example, add a condition to limit intRow
to 1 to get a single row with the highest col3
for each (col1, col2)
pair.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(26)
MySQL 5.7 或更低版本中没有排名功能。 (MySQL v8.0+ 支持此功能,请参阅@LukaszSzozda 的回答
)可以得到的是使用一个变量:
是的。如果是 Oracle,您可以使用 LEAD 函数在下一个值处达到峰值。值得庆幸的是,Quassnoi 涵盖了您需要的逻辑在 MySQL 中实现。
There is no ranking functionality in MySQL 5.7 or below. (This is supported in MySQL v8.0+, see @LukaszSzozda's answer)
The closest you can get is to use a variable:
Yes. If it were Oracle, you could use the LEAD function to peak at the next value. Thankfully, Quassnoi covers the logic for what you need to implement in MySQL.
这是分组最大值,最常见的一种- 问 SQL 问题(因为看起来应该很容易,但实际上并非如此)。
我经常喜欢空自连接:
“获取表中没有其他匹配 col1、col2 的行具有更高 col3 的行。” (您会注意到,如果多行具有相同的 col1、col2、col3,则此解决方案和大多数其他分组最大解决方案将返回多行。如果这是一个问题,您可能需要一些后处理。)
That's a groupwise maximum, one of the most commonly-asked SQL questions (since it seems like it should be easy, but actually it kind of isn't).
I often plump for a null-self-join:
“Get the rows in the table for which no other row with matching col1,col2 has a higher col3.” (You will notice this and most other groupwise-maximum solutions will return multiple rows if more than one row has the same col1,col2,col3. If that's a problem you may need some post-processing.)
我总是遵循这种模式。给定此表:
您可以获得此结果:
通过运行此查询,不需要定义任何变量:
I always end up following this pattern. Given this table:
You can get this result:
By running this query, which doesn't need any variable defined:
从
MySQL 8.0.0
及更高版本开始,您可以本机使用窗口函数。1.4 MySQL 8.0 中的新增功能:
ROW_NUMBER() over_clause :
演示:
DBFiddle 演示
From
MySQL 8.0.0
and above you could natively use windowed functions.1.4 What Is New in MySQL 8.0:
ROW_NUMBER() over_clause :
Demo:
DBFiddle Demo
查看这篇文章,它展示了如何在 MySQL 中使用分区来模拟 SQL ROW_NUMBER()。我在 WordPress 实施中遇到了同样的情况。我需要 ROW_NUMBER() 但它不存在。
http://www.explodybits.com/2011/11/mysql-row- number/
文章中的示例是使用单个按字段分区。要按其他字段进行分区,您可以执行以下操作:
使用 concat_ws 处理 null。我使用 int、date 和 varchar 对 3 个字段进行了测试。希望这有帮助。查看这篇文章,它分解了这个查询并对其进行了解释。
Check out this Article, it shows how to mimic SQL ROW_NUMBER() with a partition by in MySQL. I ran into this very same scenario in a WordPress Implementation. I needed ROW_NUMBER() and it wasn't there.
http://www.explodybits.com/2011/11/mysql-row-number/
The example in the article is using a single partition by field. To partition by additional fields you could do something like this:
Using concat_ws handles null's. I tested this against 3 fields using an int, date, and varchar. Hope this helps. Check out the article as it breaks this query down and explains it.
我也会投票支持 Mosty Mostacho 的解决方案,对其查询代码进行少量修改:
这将给出相同的结果:
对于表:
唯一的区别是查询不使用 JOIN 和 GROUP BY,而是依赖于嵌套选择。
I would also vote for Mosty Mostacho's solution with minor modification to his query code:
Which will give the same result:
for the table:
With the only difference that the query doesn't use JOIN and GROUP BY, relying on nested select instead.
我会定义一个函数:
然后我可以这样做:
现在你没有子查询,而视图中没有子查询。
I would define a function:
then I could do:
Now you don't have a subquery, which you can't have in views.
在mysql中查询row_number
query for row_number in mysql
MySQL 中没有像
rownum
、row_num()
这样的函数,但解决方法如下:There is no funtion like
rownum
,row_num()
in MySQL but the way around is like below:重要提示:请考虑升级到 MySQL 8+ 并使用已定义和记录的 ROW_NUMBER() 函数,并放弃与功能有限的旧版本 MySQL 相关的旧黑客
现在这是这些黑客之一:
答案在这里大多数/全部使用查询中变量似乎忽略了文档所说的事实(释义):
因此,他们存在产生错误答案的风险,因为他们通常会执行“
如果这些自下而上进行评估,则行号将停止工作(不分区)
所以我们需要使用有保证执行顺序的东西。输入案例:
作为轮廓 ld,prevcol 的分配顺序很重要 - 在我们为当前行分配值之前,必须将 prevcol 与当前行的值进行比较(否则它将是当前行的 col 值,而不是上一个行的值)行的列值)。
以下是它们的组合方式:
第一个 WHEN 被评估。如果该行的 col 与前一行的 col 相同,则 @r 递增并从 CASE 返回。该返回值存储在@r 中。 MySQL 的一个功能是,赋值会将分配给 @r 的新值返回到结果行中。
对于结果集上的第一行,@prevcol 为 null(它在子查询中初始化为 null),因此该谓词为 false。每次 col 更改时,第一个谓词也会返回 false(当前行与前一行不同)。这会导致评估第二个 WHEN。
第二个 WHEN 谓词始终为 false,它的存在纯粹是为了向 @prevcol 分配新值。因为这一行的 col 与前一行的 col 不同(我们知道这一点,因为如果它相同,则将使用第一个 WHEN),因此我们必须分配新值以保留它以供下次测试。因为先进行赋值,然后将赋值结果与 null 进行比较,而与 null 等同的任何内容都是 false,因此该谓词始终为 false。但至少评估它做了保留该行的 col 值的工作,因此可以根据下一行的 col 值对其进行评估
因为第二个 WHEN 为 false,这意味着在我们要分区的列的情况下(col) 已更改,是 ELSE 为 @r 提供了新值,从 1 重新开始编号
我们遇到这样的情况:
具有一般形式:
脚注:
pcol 中的 p 表示“分区”,o 表示“分区” ocol 中的意思是“顺序” - 在一般形式中,我从变量名称中删除了“prev”以减少视觉混乱
(@pcolX := colX) = null
周围的括号很重要。如果没有它们,您会将 null 分配给 @pcolX 并且事情会停止工作这是一个折衷方案,结果集也必须按分区列排序,以便与前一列进行比较才能计算出来。因此,您不能根据一列对行号进行排序,但将结果集排序到另一列您也许可以使用子查询来解决此问题,但我相信文档还指出,除非使用 LIMIT,否则可能会忽略子查询排序,这可能会影响性能
除了测试该方法是否有效之外,我还没有深入研究它,但是如果存在第二个 WHEN 中的谓词将被优化掉的风险(与 null 相比的任何内容都是 null/false 那么为什么还要运行分配)并且不执行,它也会停止。根据我的经验,这似乎不会发生,但如果可以合理地发生,我会很乐意接受评论并提出解决方案
将创建 @pcolX 的空值转换为列的实际类型可能是明智的,在创建 @pcolX 变量的子查询,即:
select @pcol1 := CAST(null as INT), @pcol2 := CAST(null as DATE)
Important: Please consider upgrading to MySQL 8+ and use the defined and documented ROW_NUMBER() function, and ditch old hacks tied to a feature limited ancient version of MySQL
Now here's one of those hacks:
The answers here that use in-query variables mostly/all seem to ignore the fact that the documentation says (paraphrase):
As such, there's a risk they will churn out the wrong answer, because they typically do a
If these are ever evaluated bottom up, the row number will stop working (no partitions)
So we need to use something with a guaranteed order of execution. Enter CASE WHEN:
As outline ld, order of assignment of prevcol is important - prevcol has to be compared to the current row's value before we assign it a value from the current row (otherwise it would be the current rows col value, not the previous row's col value).
Here's how this fits together:
The first WHEN is evaluated. If this row's col is the same as the previous row's col then @r is incremented and returned from the CASE. This return led values is stored in @r. It's a feature of MySQL that assignment returns the new value of what is assigned into @r into the result rows.
For the first row on the result set, @prevcol is null (it is initialised to null in the subquery) so this predicate is false. This first predicate also returns false every time col changes (current row is different to previous row). This causes the second WHEN to be evaluated.
The second WHEN predicate is always false, and it exists purely to assign a new value to @prevcol. Because this row's col is different to the previous row's col (we know this because if it were the same, the first WHEN would have been used), we have to assign the new value to keep it for testing next time. Because the assignment is made and then the result of the assignment is compared with null, and anything equated with null is false, this predicate is always false. But at least evaluating it did its job of keeping the value of col from this row, so it can be evaluated against the next row's col value
Because the second WHEN is false, it means in situations where the column we are partitioning by (col) has changed, it is the ELSE that gives a new value for @r, restarting the numbering from 1
We this get to a situation where this:
Has the general form:
Footnotes:
The p in pcol means "partition", the o in ocol means "order" - in the general form I dropped the "prev" from the variable name to reduce visual clutter
The brackets around
(@pcolX := colX) = null
are important. Without them you'll assign null to @pcolX and things stop workingIt's a compromise that the result set has to be ordered by the partition columns too, for the previous column compare to work out. You can't thus have your rownumber ordered according to one column but your result set ordered to another You might be able to resolve this with subqueries but I believe the docs also state that subquery ordering may be ignored unless LIMIT is used and this could impact performance
I haven't delved into it beyond testing that the method works, but if there is a risk that the predicates in the second WHEN will be optimised away (anything compared to null is null/false so why bother running the assignment) and not executed, it also stops. This doesn't seem to happen in my experience but I'll gladly accept comments and propose solution if it could reasonably occur
It may be wise to cast the nulls that create @pcolX to the actual types of your columns, in the subquery that creates the @pcolX variables, viz:
select @pcol1 := CAST(null as INT), @pcol2 := CAST(null as DATE)
我发现最有效的解决方案是使用这样的子查询:
PARTITION BY 列只需与“=”进行比较并用 AND 分隔。 ORDER BY 列将与“<”进行比较或“>”,并用 OR 分隔。
我发现这非常灵活,即使成本有点高。
The solution I found to work the best was using a subquery like this:
The PARTITION BY columns just get compared with '=' and separated by AND. The ORDER BY columns would be compared with '<' or '>', and separated by OR.
I've found this to be very flexible, even if it is a little bit costly.
rownumber 功能无法被模仿。您可能会得到预期的结果,但在某个阶段您很可能会感到失望。
mysql 文档是这么说的:
问候,
乔治.
The rownumber functionality can't be mimicked. You might get the results you expect, but you'll most likely get disappointed at some stage.
Here's what mysql documentation says:
Regards,
Georgi.
MariaDB 10.2 正在实现“窗口函数”,包括 RANK()、ROW_NUMBER() 和其他几个功能:
https://mariadb.com/kb/en/mariadb/window-functions/
根据本月 Percona Live 上的演讲,它们得到了相当好的优化。
语法与问题中的代码相同。
MariaDB 10.2 is implementing "Window Functions", including RANK(), ROW_NUMBER() and several other things:
https://mariadb.com/kb/en/mariadb/window-functions/
Based on a talk at Percona Live this month, they are reasonably well optimized.
The syntax is identical to the code in the Question.
MySQL 从 8.0+ 版本开始支持 ROW_NUMBER()。
如果您使用 MySQL 8.0 或更高版本,请查看 ROW_NUMBER() 函数。
否则,您可以模拟 ROW_NUMBER() 函数。
row_number() 是一个排名函数,它返回行的序号,第一行从 1 开始。
对于旧版本,
MySQL has supported the ROW_NUMBER() since version 8.0+.
If you use MySQL 8.0 or later, check it out ROW_NUMBER() function.
Otherwise, you have emulate ROW_NUMBER() function.
The row_number() is a ranking function that returns a sequential number of a row, starting from 1 for the first row.
for older version,
这允许在 MySQL 中实现 ROW_NUMBER() AND PARTITION BY 提供的相同功能
This allows the same functionality that ROW_NUMBER() AND PARTITION BY provides to be achieved in MySQL
我没有看到任何涵盖“PARTITION BY”部分的简单答案,所以这是我的:
在这个简单的示例中,我只放置了一个,但您可以有多个“PARTITION BY”部分
I don't see any simple answer covering the "PARTITION BY" part so here's mine :
In this simple example I only put one but you can have several "PARTITION BY" parts
这也可能是一个解决方案:
This could also be a solution:
如果您的查询包含
GROUP BY
语句,则使用交叉连接和逗号的解决方案将不起作用。对于这种情况,您可以使用子选择:Solutions with cross join and comma won't work if your query has
GROUP BY
statement. For such cases you can use subselect:我认为你可以在这里使用 DENSE_RANK() 函数。
示例:
https://www.mysqltutorial.org/mysql-window -functions/mysql-dense_rank-function/
I think you can use DENSE_RANK() function here.
Example:
https://www.mysqltutorial.org/mysql-window-functions/mysql-dense_rank-function/
有点晚了,但也可能对寻找答案的人有所帮助...
在 rows/row_number 示例 - 可以在任何 SQL 中使用的递归查询:
A bit late but may also help to someone who looks for answers...
Between rows/row_number example - recursive query that may be used in any SQL:
也有点晚了,但今天我有同样的需求,所以我在谷歌上搜索,最后在 Pinal Dave 的文章 http://blog.sqlauthority.com/2014/03/09/mysql-reset-row -number-for-each-group-partition-by-row-number/
我想关注 Paul 最初的问题(这也是我的问题),因此我将我的解决方案总结为一个工作示例。
因为我们想要对两列进行分区,所以我会在迭代期间创建一个 SET 变量来标识是否启动了新组。
3 意味着 MAKE_SET 的第一个参数我想要 SET 中的两个值 (3=1|2)。
当然,如果我们没有两个或更多列来构造组,我们可以消除 MAKE_SET 操作。结构完全相同。这对我来说是按要求工作的。非常感谢 Pinal Dave 的清晰演示。
Also a bit late but today I had the same need so I did search on Google and finally a simple general approach found here in Pinal Dave's article http://blog.sqlauthority.com/2014/03/09/mysql-reset-row-number-for-each-group-partition-by-row-number/
I wanted to focus on Paul's original question (that was my problem as well) so I summarize my solution as a working example.
Beacuse we want to partition over two column I would create a SET variable during the iteration to identify if a new group was started.
The 3 means at the first parameter of MAKE_SET that I want both value in the SET (3=1|2).
Of course if we do not have two or more columns constructing the groups we can eliminate the MAKE_SET operation. The construction is exactly the same. This is working for me as required. Many thanks to Pinal Dave for his clear demonstration.
这不是最强大的解决方案 - 但如果您只是想在只有几个不同值的字段上创建分区排名,那么使用具有所需数量的变量的逻辑时可能并不难处理。
过去这样的事情对我有用:
希望这是有意义的/有帮助!
This is not the most robust solution - but if you're just looking to create a partitioned rank on a field with only a few different values, it may not be unwieldily to use some case when logic with as many variables as you require.
Something like this has worked for me in the past:
Hope that makes sense / helps!
2023 年仍然支持 MySQL 5.7.38,并且需要 ROW_NUMBER() 我最终做了这样的事情:
结果:
Still supporting MySQL 5.7.38 in 2023, and needing ROW_NUMBER() I ended up doing something like this:
Results:
当我们有多个列时,这非常适合我创建 RowNumber。在本例中为两列。
This Work perfectly for me to create RowNumber when we have more than one column. In this case two column.
对于另一列的分区,一种方法是@abcdn 描述的方法。然而,它的性能较低。 我建议使用此代码,它不需要将表与其自身连接起来:
考虑同一张桌子。
你可以像这样进行分区:
结果将是这样的:
优点是我们不需要将表与其自身连接
for the partioning over anothe column one way is that described by @abcdn. However, it has a low performance. I propose use this code which does not require joining a table with itself:
Considee the same table.
you can get paritioning like this:
the reult would be like this :
The advantage is we do not need to join table with itself