Selecting the most recent row with GROUP BY in MySQL

Posted on 2024-11-01 09:41:12

I'm trying to select each user with their most recent payment. The query I have now selects the user's first payment. I.e., if a user has made two payments with payments.id values 10 and 11, the query returns the user with the info for payment id 10, not 11.

  SELECT users.*, payments.method, payments.id AS payment_id 
    FROM `users` 
         LEFT JOIN `payments` ON users.id = payments.user_id 
GROUP BY users.id

I've added ORDER BY payments.id, but the query seems to ignore it and still selects the first payment.

All help appreciated.
Thanks.

寂寞陪衬 2024-11-08 09:41:12

You want the groupwise maximum; in essence, group the payments table to identify the maximal records, then join the result back with itself to fetch the other columns:

SELECT users.*, payments.method, payments.id AS payment_id
FROM   payments NATURAL JOIN (
  SELECT   user_id, MAX(id) AS id 
  FROM     payments
  GROUP BY user_id
) t RIGHT JOIN users ON users.id = t.user_id

Note that MAX(id) may not be the "most recent payment", depending on your application and schema: it's usually better to determine "most recent" from a TIMESTAMP column than from a synthetic identifier such as an AUTO_INCREMENT primary key column.
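As a rough illustration of that last point, here is a minimal sketch of the same groupwise-maximum join driven by a timestamp instead of the id. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it is self-contained; SQLite stands in for MySQL here. The `paid_at` column, the `name` column, and the sample rows are assumptions made up for the example and are not part of the original schema.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; `paid_at` and `name` are hypothetical columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        method TEXT,
        paid_at TEXT
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    -- Payment 11 has the higher id but an earlier timestamp than payment 10.
    INSERT INTO payments VALUES
        (10, 1, 'VISA',   '2024-03-01 12:00:00'),
        (11, 1, 'PAYPAL', '2024-02-01 12:00:00'),
        (12, 2, 'VISA',   '2024-01-15 08:30:00');
""")

latest_by_time = conn.execute("""
    SELECT u.id, u.name, p.method, p.id AS payment_id
    FROM users u
    LEFT JOIN (
        -- Latest timestamp per user.
        SELECT user_id, MAX(paid_at) AS paid_at
        FROM payments
        GROUP BY user_id
    ) t ON t.user_id = u.id
    LEFT JOIN payments p
        -- Join back to fetch the other columns of that row.
        ON p.user_id = t.user_id AND p.paid_at = t.paid_at
    ORDER BY u.id
""").fetchall()

for row in latest_by_time:
    print(row)
```

With this sample data, user 1 gets payment 10 rather than 11, and the user without payments is kept with NULLs. If two payments of one user share the same timestamp, this sketch returns both rows, so a tie-breaker would be needed in that case.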

夜还是长夜 2024-11-08 09:41:12

I read the following solution on SO long ago and can't find the link to credit it, but here goes:

SELECT users.*, payments.method, payments.id AS payment_id, payments2.id
FROM users
JOIN payments
    ON users.id = payments.user_id 
LEFT JOIN payments AS payments2
    ON payments.user_id = payments2.user_id
    AND payments.id < payments2.id
WHERE payments2.id IS NULL

To understand how this works, just drop the WHERE payments2.id IS NULL and you'll see what is happening. For instance, it could produce the following output (I haven't built the schema to test this, so it's pseudo-output). Assume there are the following records in payments:

id | user_id | method
1  | 1       | VISA
2  | 1       | VISA
3  | 1       | VISA
4  | 1       | VISA

And the above SQL (without the WHERE payments2.id IS NULL clause) should produce:

users.id | payments.method | payments.id | payments2.id
1        | VISA            | 1           | 2
1        | VISA            | 1           | 3
1        | VISA            | 1           | 4
1        | VISA            | 2           | 3
1        | VISA            | 2           | 4
1        | VISA            | 3           | 4
1        | VISA            | 4           | NULL

As you can see, the last line produces the desired result: since there's no payments2.id > 4, the LEFT JOIN results in payments2.id = NULL.
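The pseudo-output above can be checked with a small runnable sketch. It uses an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL, with payments2 written as an alias of the payments table (a self-join) and a users table reduced to a single id column; both simplifications are assumptions for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the one-column users table is a simplification.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, method TEXT);
    INSERT INTO users VALUES (1);
    INSERT INTO payments VALUES
        (1, 1, 'VISA'), (2, 1, 'VISA'), (3, 1, 'VISA'), (4, 1, 'VISA');
""")

# payments2 is an alias of payments: each payment is paired with every
# later payment of the same user, or with NULL if there is none.
SELF_JOIN = """
    SELECT users.id, payments.method, payments.id, payments2.id
    FROM users
    JOIN payments
        ON users.id = payments.user_id
    LEFT JOIN payments AS payments2
        ON payments.user_id = payments2.user_id
        AND payments.id < payments2.id
"""

# Without the filter: all (payment, later payment) pairs.
all_pairs = conn.execute(
    SELF_JOIN + " ORDER BY payments.id, payments2.id"
).fetchall()

# With the filter: only the payment that has no later payment.
latest_only = conn.execute(
    SELF_JOIN + " WHERE payments2.id IS NULL"
).fetchall()

for row in all_pairs:
    print(row)
print("latest:", latest_only)
```

The unfiltered query yields the seven rows from the table above, and adding the IS NULL filter leaves only the row for payment 4.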

I've found this solution to be much faster (from my early tests) than the accepted answer.

Using a different schema but a similar query, of 16095 records:

select as1.*, as2.id
from allocation_status as1
left join allocation_status as2 
    on as1.allocation_id = as2.allocation_id
    and as1.id < as2.id
where as2.id is null;

16095 rows affected, taking 4.1ms

Compared to the accepted answer of MAX / subquery:

SELECT as1.* 
FROM allocation_status as1
JOIN (
    SELECT max(id) as id
    FROM allocation_status
    group by allocation_id
) as_max on as1.id = as_max.id 

16095 rows affected, taking 14.8ms

栖竹 2024-11-08 09:41:12

I've just been dealing with pretty much exactly the same problem and found these answers helpful. My testing seems to suggest you can make it slightly simpler than the accepted answer, viz.:

SELECT u.*, p.method, p.id AS payment_id 
FROM `users` u, `payments` p
WHERE u.id = p.user_id 
    AND p.id = (SELECT MAX(p2.id) FROM payments p2
                    WHERE p2.user_id = u.id);

I've not performance tested the differences, but the db I'm working on has over 50,000 users and over 60,000 payments and the query runs in 0.024 seconds.
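One thing worth checking with this form: the comma join behaves like an inner join, so users without any payment drop out, whereas the query in the question used a LEFT JOIN. The sketch below compares the two on an in-memory SQLite database (a stand-in for MySQL, via Python's sqlite3 module). The `name` column and the sample rows are assumptions for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; `name` is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, method TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');  -- bob has no payments
    INSERT INTO payments VALUES (10, 1, 'VISA'), (11, 1, 'PAYPAL');
""")

# Same shape as the query above: implicit inner join plus correlated MAX.
inner_rows = conn.execute("""
    SELECT u.id, u.name, p.method, p.id AS payment_id
    FROM users u, payments p
    WHERE u.id = p.user_id
      AND p.id = (SELECT MAX(p2.id) FROM payments p2
                  WHERE p2.user_id = u.id)
    ORDER BY u.id
""").fetchall()

# LEFT JOIN variant: the correlated condition moves into the ON clause,
# so users without payments are kept with NULL payment columns.
outer_rows = conn.execute("""
    SELECT u.id, u.name, p.method, p.id AS payment_id
    FROM users u
    LEFT JOIN payments p
        ON p.user_id = u.id
       AND p.id = (SELECT MAX(p2.id) FROM payments p2
                   WHERE p2.user_id = u.id)
    ORDER BY u.id
""").fetchall()

print(inner_rows)
print(outer_rows)
```

Which variant is appropriate depends on whether users with no payments should appear in the result.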

黑凤梨 2024-11-08 09:41:12

Taking this one step further, we can also use:

select payment_id, cust_id, amount, payment_method 
from my_table where payment_id in 
(
    select max(payment_id) from my_table group by cust_id
);

...but this query is also taking way too long in my context. The inner select is smoking fast, but the outer takes a while, and with only 124 results from the inner. Ideas?
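One rewrite that is often suggested for this situation is to join against the grouped subquery as a derived table instead of using IN, since some MySQL versions are reported to handle IN (subquery) poorly. Whether it actually helps depends on the MySQL version and the indexes, which is not verified here. The sketch below only shows that the two forms return the same rows, using an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL; the column types and sample rows are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (
        payment_id INTEGER PRIMARY KEY,
        cust_id INTEGER,
        amount REAL,
        payment_method TEXT
    );
    INSERT INTO my_table VALUES
        (1, 100, 9.99, 'VISA'),
        (2, 100, 5.00, 'PAYPAL'),
        (3, 200, 7.50, 'VISA');
""")

# Original form: IN over the grouped subquery.
in_rows = conn.execute("""
    SELECT payment_id, cust_id, amount, payment_method
    FROM my_table
    WHERE payment_id IN (
        SELECT MAX(payment_id) FROM my_table GROUP BY cust_id
    )
    ORDER BY payment_id
""").fetchall()

# Rewritten form: join against the grouped subquery as a derived table.
join_rows = conn.execute("""
    SELECT m.payment_id, m.cust_id, m.amount, m.payment_method
    FROM my_table m
    JOIN (
        SELECT MAX(payment_id) AS payment_id
        FROM my_table
        GROUP BY cust_id
    ) latest ON latest.payment_id = m.payment_id
    ORDER BY m.payment_id
""").fetchall()

print(in_rows == join_rows)
```

Running EXPLAIN on both forms against the real table would show whether the plans differ in your environment.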

爱给你人给你 2024-11-08 09:41:12

My solution:

SELECT u.codigo,
       u.nome,
       max(r.latitude),
       max(r.longitude),
       max(r.data_criacao)
FROM TAB_REGISTRO_COORDENADAS r
INNER JOIN TAB_USUARIO u
    ON u.codigo = r.cd_usuario
GROUP BY u.codigo
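It may be worth checking how this behaves when the maxima of the individual columns come from different rows, because each MAX() is computed independently. The sketch below compares the per-column MAX query with a correlated-subquery query that picks the row with the latest data_criacao. It runs on an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL; the column types and sample rows are assumptions, and `u.nome` is added to the GROUP BY only to keep the query portable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column types and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TAB_USUARIO (codigo INTEGER PRIMARY KEY, nome TEXT);
    CREATE TABLE TAB_REGISTRO_COORDENADAS (
        cd_usuario INTEGER,
        latitude REAL,
        longitude REAL,
        data_criacao TEXT
    );
    INSERT INTO TAB_USUARIO VALUES (1, 'ana');
    -- The newer row has the larger latitude but the smaller longitude.
    INSERT INTO TAB_REGISTRO_COORDENADAS VALUES
        (1, 10.0, 50.0, '2024-01-01'),
        (1, 20.0, 40.0, '2024-01-02');
""")

# Each MAX() is evaluated on its own column across the whole group.
per_column_max = conn.execute("""
    SELECT u.codigo, u.nome,
           MAX(r.latitude), MAX(r.longitude), MAX(r.data_criacao)
    FROM TAB_REGISTRO_COORDENADAS r
    INNER JOIN TAB_USUARIO u ON u.codigo = r.cd_usuario
    GROUP BY u.codigo, u.nome
""").fetchall()

# Pick the single row whose data_criacao is the latest for that user.
latest_row = conn.execute("""
    SELECT u.codigo, u.nome, r.latitude, r.longitude, r.data_criacao
    FROM TAB_REGISTRO_COORDENADAS r
    INNER JOIN TAB_USUARIO u ON u.codigo = r.cd_usuario
    WHERE r.data_criacao = (
        SELECT MAX(r2.data_criacao)
        FROM TAB_REGISTRO_COORDENADAS r2
        WHERE r2.cd_usuario = r.cd_usuario
    )
""").fetchall()

print(per_column_max)
print(latest_row)
```

With this sample data the first query combines the latitude of the newer row with the longitude of the older one, while the second returns the newer row as stored.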

天涯沦落人 2024-11-08 09:41:12

I have come across this before. GROUP BY is intended more for aggregate expressions or identical records. From what I found, the best practice is to do something like this:

    SELECT  u.*, p.method, p.id AS payment_id
    FROM    (
        SELECT  DISTINCT users.id
        FROM    users
        ) ur
    JOIN    users u              -- full user row, needed for u.*
    ON      u.id = ur.id
    JOIN    payments p
    ON      p.id =
        (
        -- id of this user's latest payment
        SELECT  pt.id
        FROM    payments pt
        WHERE   pt.user_id = ur.id
        ORDER BY
                pt.id DESC
        LIMIT 1
        )
