MySQL order by before group by

Posted on 2024-10-19 22:56:02 · 345 words · 5 views · 0 comments

I need to find the latest post for each author and then group the results so that I get only a single latest post per author.

SELECT wp_posts.* FROM wp_posts
        WHERE wp_posts.post_status='publish'
        AND wp_posts.post_type='post'
        GROUP BY wp_posts.post_author           
        ORDER BY wp_posts.post_date DESC

This correctly groups the output so I get only one post per author, but the ordering is applied after the rows have been grouped, not before the row for each group is selected.

Comments (10)

时间海 2024-10-26 22:56:02

SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status='publish' AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
HAVING wp_posts.post_date = MAX(wp_posts.post_date) /* ONLY THE LAST POST FOR EACH AUTHOR */
ORDER BY wp_posts.post_date DESC


EDIT:

After some comments, I have decided to add some more information.

The company I work at also uses Postgres and, above all, SQL Server. These databases don't allow such queries, so I know there is another way to do this (I give a solution below). You also need to know what you are doing if you don't group by every column in the projection or wrap it in an aggregate function. Otherwise, leave it alone!

I chose the solution above because this is a specific question. Tom wants to get the most recent post for each author on a WordPress site. In my view it is negligible for this analysis whether an author makes more than one post per second; WordPress should even forbid that through its spam double-post detection. I know from personal experience that such a dirty GROUP BY can bring a really significant performance benefit in MySQL, and if you know what you are doing, you can use it. I have such dirty GROUP BYs in applications I am professionally accountable for. There I have tables with several million rows where queries take 5-15 s instead of 100+ seconds.

This may be useful on the pros and cons: http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/debunking-group-by-myths.html


SELECT
    wp_posts.*
FROM
    wp_posts
    JOIN
    (
        SELECT
            g.post_author,
            MAX(g.post_date) AS post_date
        FROM wp_posts AS g
        WHERE
            g.post_status='publish'
            AND g.post_type='post'
        GROUP BY g.post_author
    ) AS t
    ON wp_posts.post_author = t.post_author AND wp_posts.post_date = t.post_date
ORDER BY wp_posts.post_date

But if an author has more than one post in the same second, you will get more than one row for that author, not just the last one.

Now you can spin the wheel again and take the post with the highest Id. Even then it is not guaranteed that you really get the last one.
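
One way to make that tie-break explicit is an anti-join that keeps a post only when the same author has no later post, with the higher ID winning when two dates are equal. This is a sketch under assumptions, not a tested drop-in: it assumes the primary key column is named `ID`, as in a stock WordPress schema, and that a larger `ID` means a later insert, which, as noted above, is not strictly guaranteed.

```sql
-- Keep row p only if no "newer" published post exists for the same author.
-- "Newer" means a later post_date, or the same post_date with a larger ID
-- (assumed tie-break; ID is the assumed primary key column).
SELECT p.*
FROM wp_posts AS p
LEFT JOIN wp_posts AS newer
       ON newer.post_author = p.post_author
      AND newer.post_status = 'publish'
      AND newer.post_type   = 'post'
      AND (newer.post_date > p.post_date
           OR (newer.post_date = p.post_date AND newer.ID > p.ID))
WHERE p.post_status = 'publish'
  AND p.post_type   = 'post'
  AND newer.ID IS NULL          -- no newer row was found for this author
ORDER BY p.post_date DESC;
```

Because the comparison on (post_date, ID) is strict and ID is unique, at most one row per author can survive the `newer.ID IS NULL` filter.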

忘年祭陌 2024-10-26 22:56:02

I'm not sure I understand your requirement correctly, but the following inner statement gets the latest post_date for each author and joins these back to the wp_posts table to get the complete records.

SELECT  *
FROM    wp_posts wp
        INNER JOIN (
          SELECT  post_author
                  , MAX(post_date) AS post_date
          FROM    wp_posts
          WHERE   post_status = 'publish'
                  AND post_type = 'post'
          GROUP BY
                  post_author
        ) wpmax ON wpmax.post_author = wp.post_author
                   AND wpmax.post_date = wp.post_date
ORDER BY
        wp.post_date DESC
美人骨 2024-10-26 22:56:02

I think @edze's response is wrong.

In the MySQL manual you can read:

MySQL extends the use of GROUP BY so that the select list can refer to
nonaggregated columns not named in the GROUP BY clause. You can use
this feature to get better performance by avoiding unnecessary column
sorting and grouping. However, this is useful primarily when all
values in each nonaggregated column not named in the GROUP BY are the
same for each group. The server is free to choose any value from each
group, so unless they are the same, the values chosen are
indeterminate. Furthermore, the selection of values from each group
cannot be influenced by adding an ORDER BY clause. Sorting of the
result set occurs after values have been chosen, and ORDER BY does
not affect which values the server chooses.

Two great references:

Sorry, but I cannot comment on @edze's response because of my reputation, so I have written a new answer.
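
The behaviour the manual describes can be made visible by turning on the `ONLY_FULL_GROUP_BY` SQL mode, under which MySQL rejects such a query instead of silently picking arbitrary values. A minimal sketch, assuming a session where you are free to change `sql_mode` (note that the statement below replaces any other modes set for that session):

```sql
-- Replace this session's SQL modes with ONLY_FULL_GROUP_BY (session scope only).
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode active, the query from the question is rejected with an error:
-- wp_posts.* refers to nonaggregated columns that are neither named in
-- GROUP BY nor determined by post_author.
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status = 'publish'
  AND wp_posts.post_type = 'post'
GROUP BY wp_posts.post_author
ORDER BY wp_posts.post_date DESC;
```

Getting an error here is the point: it shows that the rows returned without the mode were never tied to the ORDER BY clause.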

往日 2024-10-26 22:56:02

Do the GROUP BY after the ORDER BY by wrapping the ordered query in an outer query that groups, like this:

SELECT t.* FROM (SELECT * FROM table ORDER BY time DESC) t GROUP BY t.author
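
The same intent, ordering the rows within each author and then keeping the first one, can also be written with a window function, which does not rely on how GROUP BY picks values. This is a sketch under assumptions: it needs MySQL 8.0 or later (where window functions are available), it uses the wp_posts columns from the question rather than the placeholder names above, and it assumes an `ID` primary key as a tie-break for equal dates.

```sql
-- Number each author's published posts from newest to oldest,
-- then keep only the row numbered 1 for every author.
SELECT ranked.*
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY p.post_author
               ORDER BY p.post_date DESC, p.ID DESC  -- ID: assumed tie-break
           ) AS rn
    FROM wp_posts AS p
    WHERE p.post_status = 'publish'
      AND p.post_type   = 'post'
) AS ranked
WHERE ranked.rn = 1
ORDER BY ranked.post_date DESC;
```

The result carries an extra `rn` column (always 1); list the wanted columns explicitly in the outer SELECT to leave it out.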
┼── 2024-10-26 22:56:02

It doesn't matter whether you order before or after the GROUP BY, because ordering only means that 213 becomes 123 or 321, nothing more. GROUP BY takes only SOME entry per group, not necessarily the latest. I suggest working with a subselect here, like:

SELECT wp_posts.* FROM wp_posts
        WHERE wp_posts.post_status='publish'
        AND wp_posts.post_type='post'
        AND wp_posts.post_date = (Select max(post_date) from wp_posts where author = ... )
愛上了 2024-10-26 22:56:02

What do you think about this? It seems to work for me:

SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
    WHERE wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
    GROUP BY wp_posts.post_author

It gives me all the authors with their most recent post_date... Do you see a problem there? I don't.

污味仙女 2024-10-26 22:56:02
    SELECT wp_posts.*,max(wp_posts.post_date) FROM wp_posts
    WHERE wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
    GROUP BY wp_posts.post_author 
橘亓 2024-10-26 22:56:02

When your table becomes large, performance needs to be checked as well.
I tested all the options from the answers here on a PM system with 136K messages and a link table with 83K rows.

When you need only a count, or only IDs, Alex's solution is the best.

SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
    WHERE wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
    GROUP BY wp_posts.post_author

When you need other fields, I had to modify Husky110's solution (adapted to my table design; the query here is only an example and has not been checked), which on my tables is 10x faster than the subquery option:

SELECT wp_posts.* FROM wp_posts,
    (Select post_id as pid,  max(post_date) maxdate from wp_posts where author = ... group by author order by maxdate  desc limit 4) t
    WHERE wp_posts.post_status='publish'
    AND wp_posts.post_type='post'
    AND wp_posts.post_id = pid

This change can select more than one post (one per user, for example) and can be adapted to other solutions.

Moshe.

纸伞微斜 2024-10-26 22:56:02

Use the code below...

<?php
//get all users, iterate through users, query for one post for the user,
//if there is a post then display the post title, author, content info
$blogusers = get_users_of_blog();
if ($blogusers) {
  foreach ($blogusers as $bloguser) {
    $args = array(
    'author' => $bloguser->user_id,
      'showposts' => 1,
      'caller_get_posts' => 1
    );
    $my_query = new WP_Query($args);
    if( $my_query->have_posts() ) {
      // $user = get_userdata($bloguser->user_id);
      // echo 'This is one post for author with User ID: ' . $user->ID . ' ' . $user->user_firstname . ' ' . $user->user_lastname;
      while ($my_query->have_posts()) : $my_query->the_post(); ?>
        <a href="<?php the_permalink() ?>" rel="bookmark" title="Permanent Link to <?php the_title_attribute(); ?>"><?php the_title(); ?></a>

        <small><?php the_time('F jS, Y') ?> by <?php the_author_posts_link() ?></small><?php
        the_content();
      endwhile;
    }
  }
}
?>
白首有我共你 2024-10-26 22:56:02

Here is a simple answer from
http://www.cafewebmaster.com/mysql-order-sort-group

SELECT * FROM 
(
select * from `my_table` order by timestamp desc
) as my_table_tmp

GROUP BY catid
ORDER BY nid desc

It worked wonders for me.
