SQL查询获取与另一列的最大值相对应的列值？

发布于 2024-11-26 11:16:39 字数 631 浏览 2 评论 0原文

好的，这是我的查询：

SELECT
  video_category,
  video_url,
  video_date,
  video_title,
  short_description,
  MAX(video_id) 
FROM
  videos
GROUP BY
  video_category

当它提取数据时，我得到了 video_id 的正确行，但它为其他类别提取了每个类别的第一行。因此，当我获得类别 1 的 video_id 的最大结果时，我获得最大 ID，但表中的第一行包含 url、日期、标题和描述。

如何让它提取与最大 ID 结果相对应的其他列？

编辑：已修复。

SELECT
    *
FROM
    videos
WHERE
    video_id IN
    (
        SELECT
            DISTINCT
            MAX(video_id)
        FROM
            videos
        GROUP BY
            video_category
    ) 
ORDER BY
    video_category ASC

原文

Ok, this is my query:

SELECT
  video_category,
  video_url,
  video_date,
  video_title,
  short_description,
  MAX(video_id) 
FROM
  videos
GROUP BY
  video_category

When it pulls the data, I get the correct row for the video_id, but it pulls the first row for each category for the others. So when I get the max result for the video_id of category 1, I get the max ID, but the first row in the table for the url, date, title, and description.

How can I have it pull the other columns that correspond with the max ID result?

Edit: Fixed.

SELECT
    *
FROM
    videos
WHERE
    video_id IN
    (
        SELECT
            DISTINCT
            MAX(video_id)
        FROM
            videos
        GROUP BY
            video_category
    ) 
ORDER BY
    video_category ASC

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

掩于岁月 2024-12-03 11:16:39

我会尝试这样的事情：

SELECT
   s.video_id
   ,s.video_category
   ,s.video_url
   ,s.video_date
   ,s.video_title
   ,s.short_description
FROM videos s
   JOIN (SELECT MAX(video_id) AS id FROM videos GROUP BY video_category) max
      ON s.video_id = max.id

这比你自己的解决方案快得多

I would try something like this:

SELECT
   s.video_id
   ,s.video_category
   ,s.video_url
   ,s.video_date
   ,s.video_title
   ,s.short_description
FROM videos s
   JOIN (SELECT MAX(video_id) AS id FROM videos GROUP BY video_category) max
      ON s.video_id = max.id

which is quite a lot faster than your own solution

回复收藏 0 原文

撕心裂肺的伤痛 2024-12-03 11:16:39

我最近发明了一种新技术来处理 MySQL 中的此类问题。

标量聚合缩减

标量聚合缩减是迄今为止实现此目的的最高性能方法和最简单的方法（就数据库引擎而言），因为它不需要联接、子查询和 CTE。

对于您的查询，它看起来像这样：

SELECT
  video_category,
  MAX(video_id) AS video_id,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS video_url,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_date)), 12) AS video_date,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_title)), 12) AS video_title,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), short_description)), 12) AS short_description
FROM
  videos
GROUP BY
  video_category

标量函数和聚合函数的组合执行以下操作：

LPAD 聚合内相关标识符以允许正确的字符串比较（例如，“0009”和“0025”将被正确排序）。假设主键为 INT，我将 LPADDING 为 11 个字符。如果您使用 BIGINT，您将需要增加它以支持表的序数。如果您要比较 DATETIME 字段（固定长度），则不需要填充。
将填充的标识符与输出列相连接（因此您将得到“00000000009myvalue”与“0000000025othervalue”）
最大化聚合集，这将产生“00000000025othervalue”作为获胜者。
对结果进行 SUBSTRING，这将截断比较的标识符部分，仅留下值。

如果您想要检索 CHAR 以外类型的值，则可能需要对输出执行额外的 CAST，例如，如果您希望 video_date 为 DATETIME：

CAST(SUBSTRING(MAX( CONCAT(LPAD(video_id, 11, '0'), video_date)), 12) AS DATETIME)

与自连接方法相比，此方法的另一个好处是您可以组合其他聚合数据（不仅仅是最新值），或者甚至在同一查询中组合第一个和最后一个项目，例如，

SELECT
    -- Overall totals
    video_category,
    COUNT(1) AS videos_in_category,
    DATEDIFF(MAX(video_date), MIN(video_date)) AS timespan,
    
    -- Last video details
    MAX(video_id) AS last_video_id,
    SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS last_video_url,
    ...
    
    -- First video details
    MIN(video_id) AS first_video_id,
    SUBSTRING(MIN(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS first_video_url,
    ...
    
    -- And so on

有关解释此方法与其他旧方法相比的优点的更多详细信息，我的完整博客文章位于：https://www.stevenmoseley.com/blog/tech/high-performance-sql-corlated-scalar-aggregate-reduction-queries

I recently invented a new technique to handle this type of problem in MySQL.

SCALAR-AGGREGATE REDUCTION

Scalar-Aggregate Reduction is by far the highest-performance approach and simplest method (in DB engine terms) for accomplishing this, because it requires no joins, no subqueries, and no CTE.

For your query, it would look something like this:

SELECT
  video_category,
  MAX(video_id) AS video_id,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS video_url,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_date)), 12) AS video_date,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_title)), 12) AS video_title,
  SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), short_description)), 12) AS short_description
FROM
  videos
GROUP BY
  video_category

The combination of scalar and aggregate functions does the following:

LPADs the intra-aggregate correlated identifier to allow proper string comparison (e.g. "0009" and "0025" will be properly ranked). I'm LPADDING to 11 characters here assuming an INT primary key. If you use a BIGINT, you will want to increase this to support your table's ordinality. If you're comparing on a DATETIME field (fixed length), no padding is necessary.
CONCATs the padded identifier with the output column (so you get "00000000009myvalue" vs "0000000025othervalue")
MAX the aggregate set, which will yield "00000000025othervalue" as the winner.
SUBSTRING the result, which will truncate the compared identifier portion, leaving only the value.

If you want to retrieve values in types other than CHAR, you may need to performa an additional CAST on the output, e.g. if you want video_date to be a DATETIME:

CAST(SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_date)), 12) AS DATETIME)

Another benefit of this method over the self-joining method is that you can combine other aggregate data (not just latest values), or even combine first AND last item in the same query, e.g.

SELECT
    -- Overall totals
    video_category,
    COUNT(1) AS videos_in_category,
    DATEDIFF(MAX(video_date), MIN(video_date)) AS timespan,
    
    -- Last video details
    MAX(video_id) AS last_video_id,
    SUBSTRING(MAX(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS last_video_url,
    ...
    
    -- First video details
    MIN(video_id) AS first_video_id,
    SUBSTRING(MIN(CONCAT(LPAD(video_id, 11, '0'), video_url)), 12) AS first_video_url,
    ...
    
    -- And so on

For further details explaining the benefits of this method vs other older methods, my full blog post is here: https://www.stevenmoseley.com/blog/tech/high-performance-sql-correlated-scalar-aggregate-reduction-queries

回复收藏 0 原文

攒眉千度 2024-12-03 11:16:39

一个稍微“质朴”的解决方案，但应该完成相同的工作：

SELECT
  video_category,
  video_url,
  video_date,
  video_title,
  short_description,
  video_id
FROM
  videos
ORDER BY video_id DESC
LIMIT 1;

换句话说，只需生成一个包含所需所有列的表，对其进行排序，使最大值位于顶部，然后将其截断所以你只返回一行。

A slightly more "rustic" solution, but should do the job just the same:

SELECT
  video_category,
  video_url,
  video_date,
  video_title,
  short_description,
  video_id
FROM
  videos
ORDER BY video_id DESC
LIMIT 1;

In other words, just produce a table with all of the columns that you want, sort it so that your maximum value is at the top, and chop it off so you only return one row.

回复收藏 0 原文

徒留西风 2024-12-03 11:16:39

这是一个更通用的解决方案（处理重复项）

CREATE TABLE test(
  i INTEGER,
  c INTEGER,
  v INTEGER
);


insert into test(i, c, v)
values
(3, 1, 1),
(3, 2, 2),
(3, 3, 3),
(4, 2, 4),
(4, 3, 5),
(4, 4, 6),
(5, 3, 7),
(5, 4, 8),
(5, 5, 9),
(6, 4, 10),
(6, 5, 11),
(6, 6, 12);



SELECT t.c, t.v
FROM test t
JOIN (SELECT test.c, max(i) as mi FROM test GROUP BY c) j ON
  t.i = j.mi AND
  t.c  = j.c
ORDER BY c;

Here is a more general solution (handles duplicates)

CREATE TABLE test(
  i INTEGER,
  c INTEGER,
  v INTEGER
);


insert into test(i, c, v)
values
(3, 1, 1),
(3, 2, 2),
(3, 3, 3),
(4, 2, 4),
(4, 3, 5),
(4, 4, 6),
(5, 3, 7),
(5, 4, 8),
(5, 5, 9),
(6, 4, 10),
(6, 5, 11),
(6, 6, 12);



SELECT t.c, t.v
FROM test t
JOIN (SELECT test.c, max(i) as mi FROM test GROUP BY c) j ON
  t.i = j.mi AND
  t.c  = j.c
ORDER BY c;

回复收藏 0 原文