MySQL: counting records grouped by day across multiple tables

Posted on 2024-12-24 03:42:20

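My database has news articles and blog posts. The primary key for both is an item ID (item_id) that is unique across the two tables.

Articles are stored in a table with the following fields:

item_id
title
body
date_posted

The blogposts table has the following fields:

item_id
title
body
date_posted

Both tables also have additional fields of their own.

I have a third table that holds meta information about articles and posts.

The items table has the following fields:

item_id
source_id
...

Every blog post and article has one record in the items table and one record in its own table.

What I want to do is build a query that counts the number of items posted per day. I can do this for one table using a count grouped by date_posted, but how do I combine the article and post counts in one query?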

執念 2024-12-31 03:42:20

Similar to Dems' answer, but slightly simpler:

select date_posted, count(*)
from (select date_posted from article union all
      select date_posted from blogposts) v
group by date_posted
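
The question does not say whether date_posted holds a plain date or a date with a time. If it includes a time, grouping on the raw column gives one group per timestamp rather than per day, so wrapping it in DATE() may be needed. A minimal sketch of that variant, assuming made-up sample rows and using Python's built-in sqlite3 as a stand-in for MySQL (both provide a DATE() function, but this has only been reasoned through for SQLite here):

```python
import sqlite3


def daily_counts(conn):
    """Count rows from both tables per calendar day, ignoring the time part."""
    sql = """
        SELECT DATE(date_posted) AS day, COUNT(*) AS n
        FROM (SELECT date_posted FROM article
              UNION ALL
              SELECT date_posted FROM blogposts) v
        GROUP BY DATE(date_posted)
        ORDER BY day
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article   (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    CREATE TABLE blogposts (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    INSERT INTO article   VALUES (1, '2024-12-01 09:00:00'), (2, '2024-12-01 17:30:00');
    INSERT INTO blogposts VALUES (3, '2024-12-01 12:00:00'), (4, '2024-12-02 08:15:00');
""")

rows = daily_counts(conn)
print(rows)
```

With these rows, the three items posted at different times on 2024-12-01 fall into a single group.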


晨曦慕雪 2024-12-31 03:42:20

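You can do it two ways.
1. Join everything together and then aggregate (see Tom H's answer).
2. Aggregate each table, UNION them, and aggregate again.

Option 1 may look shorter, but it means you may not benefit from indexes on the base tables (as the rows have to be re-ordered for the JOIN). So I'll show option 2, which is the direction you were heading anyway.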

SELECT
  date_posted,
  SUM(daily_count) AS daily_count
FROM
  (
   SELECT date_posted, COUNT(*) AS daily_count FROM article   GROUP BY date_posted
   UNION ALL
   SELECT date_posted, COUNT(*) AS daily_count FROM blogposts GROUP BY date_posted
  )
  AS combined
GROUP BY
  date_posted

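This should be fastest, provided each table has an index in which date_posted is the first column. Otherwise the tables will still need to be re-sorted for the aggregation.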
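
As a sanity check that aggregating twice does not change the result, here is a small sketch comparing the two-stage query with the flat UNION ALL version on made-up rows. It assumes Python's built-in sqlite3 in place of MySQL, and the index names are hypothetical. It only checks that the results match; it says nothing about which query plan MySQL would pick.

```python
import sqlite3

# Flat version: union the raw rows, then count once.
FLAT = """
    SELECT date_posted, COUNT(*) AS daily_count
    FROM (SELECT date_posted FROM article
          UNION ALL
          SELECT date_posted FROM blogposts) v
    GROUP BY date_posted
    ORDER BY date_posted
"""

# Two-stage version: count per table, union the counts, then sum them.
TWO_STAGE = """
    SELECT date_posted, SUM(daily_count) AS daily_count
    FROM (SELECT date_posted, COUNT(*) AS daily_count FROM article   GROUP BY date_posted
          UNION ALL
          SELECT date_posted, COUNT(*) AS daily_count FROM blogposts GROUP BY date_posted
         ) AS combined
    GROUP BY date_posted
    ORDER BY date_posted
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article   (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    CREATE TABLE blogposts (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    -- Hypothetical index names; date_posted is the leading (only) column.
    CREATE INDEX idx_article_date   ON article (date_posted);
    CREATE INDEX idx_blogposts_date ON blogposts (date_posted);
    INSERT INTO article   VALUES (1, '2024-12-01'), (2, '2024-12-01'), (3, '2024-12-03');
    INSERT INTO blogposts VALUES (4, '2024-12-01'), (5, '2024-12-02');
""")

flat = conn.execute(FLAT).fetchall()
two_stage = conn.execute(TWO_STAGE).fetchall()
print(flat)
print(two_stage)
```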


时光磨忆 2024-12-31 03:42:20

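I would have used a different table design for this, with types and subtypes. In that design, the Items table has a single-column primary key, and the Blog_Posts and Articles tables use that same ID as their primary key, with a foreign key to the Items table. That would make something like this pretty easy to do and also helps to ensure data integrity.

With your existing design, your best bet is probably something like this: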

SELECT
    I.item_id,
    I.source_id,
    COALESCE(A.date_posted, B.date_posted) AS date_posted,
    COUNT(*) AS date_count
FROM
    Items I
LEFT OUTER JOIN Articles A ON
    A.item_id = I.item_id AND
    I.source_id = 'A'  -- Or whatever the Articles ID is
LEFT OUTER JOIN Blog_Posts B ON
    B.item_id = I.item_id AND
    I.source_id = 'B'  -- Or whatever the Blog_Posts ID is
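GROUP BY
    -- Grouping by item_id yields one row per item, so each count is 1;
    -- group by the date expression alone to get one count per day.
    I.item_id,
    I.source_id,
    COALESCE(A.date_posted, B.date_posted)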

You could also try using a UNION:

SELECT
    SQ.item_id,
    SQ.source_id,
    SQ.date_posted,
    COUNT(*) AS date_count
FROM
    (
        SELECT I1.item_id, I1.source_id, A.date_posted
        FROM Items I1
        INNER JOIN Articles A ON A.item_id = I1.item_id
        WHERE I1.source_id = 'A'
        UNION ALL
        SELECT I2.item_id, I2.source_id, B.date_posted
        FROM Items I2
        INNER JOIN Blog_Posts B ON B.item_id = I2.item_id
        WHERE I2.source_id = 'B'
    ) SQ
GROUP BY
    SQ.item_id,
    SQ.source_id,
    SQ.date_posted


烟柳画桥 2024-12-31 03:42:20

select item_id, date_posted from blogposts where /* some conditions */
union all select item_id, date_posted from articles where /* some conditions */

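You'll probably need to put that into a subquery and, if you like, join it with other tables when running the GROUP BY. The main point is that UNION is the operator for combining like data from different tables. UNION ALL tells the database that it doesn't need to remove duplicate records, since you know the two tables never share an item_id, so it's probably a little faster.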
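
The duplicate handling matters for correctness as well as speed once the subquery selects only date_posted: plain UNION then collapses every row that shares a date, and the counts come out wrong. A small sketch of the difference, assuming made-up rows and Python's built-in sqlite3 as a stand-in for MySQL (count_per_day is a hypothetical helper, not part of any library):

```python
import sqlite3


def count_per_day(conn, set_op):
    """Count rows per date after combining both tables with UNION or UNION ALL."""
    if set_op not in ("UNION", "UNION ALL"):
        raise ValueError("set_op must be 'UNION' or 'UNION ALL'")
    # set_op is restricted to the two fixed keywords above, so the
    # string formatting cannot inject arbitrary SQL.
    sql = f"""
        SELECT date_posted, COUNT(*) AS n
        FROM (SELECT date_posted FROM article
              {set_op}
              SELECT date_posted FROM blogposts) v
        GROUP BY date_posted
        ORDER BY date_posted
    """
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article   (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    CREATE TABLE blogposts (item_id INTEGER PRIMARY KEY, date_posted TEXT);
    INSERT INTO article   VALUES (1, '2024-12-01'), (2, '2024-12-01'), (3, '2024-12-03');
    INSERT INTO blogposts VALUES (4, '2024-12-01'), (5, '2024-12-02');
""")

with_all = count_per_day(conn, "UNION ALL")   # keeps every row
with_dedupe = count_per_day(conn, "UNION")    # collapses identical dates
print(with_all)
print(with_dedupe)
```

Keeping item_id in the subquery, as in the query above, avoids the problem under either operator, because no two rows are then identical.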
