返回“评论最多”列表的单个 SQL 查询相对于过去 7 天,但列表始终必须包含某些内容

发布于 2024-11-16 18:15:29 字数 899 浏览 4 评论 0原文

起初看似简单的事情现在变得有点脑筋急转弯。

我正在构建您在新闻/博客网站右侧栏中看到的“评论最多”或“电子邮件最多”列表之一。

该列表必须与最近的日期范围(例如最近 7 天)相关,因为您希望列表始终看起来新鲜并包含较新的内容。否则,获得大量评论的少数文章将永远位居榜首。

例如,在我的例子中,该网站有几篇文章,每篇都有数百条评论,但其余的文章都少于 20 条。因此,如果像这样拉出“评论最多”列表:

select
  ArticleId,
  count(CommentId) as Comments
from
  Comment
group by
  ArticleId
order by
  count(CommentId) desc
limit 10;

那么具有数百条评论的几篇文章将始终处于顶部。该清单永远不会改变。

因此,我重写了选择以仅包含过去 7 天:

select
  ArticleId,
  count(CommentId) as Comments
from
  Comment
where
  Created >= '2011-06-14'
group by
  ArticleId
order by
  count(CommentId) desc
limit 10;

这看起来更好,但它仍然不起作用,因为它没有考虑过去 7 天内没有活动的情况。

如果过去 7 天内没有发生任何事情(或只发生了一些评论),则列表应包含在此之前的活动。底线 - 列表始终需要显示 10 篇文章。

我总是可以调用多个 SQL 语句,以 7 天为增量返回,直到填满列表,但我不想这样做。我将结果缓存在进程内内存中,但如果可能的话,我仍然希望只进行一次 SQL 调用。

想法?我觉得这很容易做到,我只是忽略了某个地方显而易见的事情。

What first seemed easy has now become a bit of a brain teaser.

I'm building one of those "Most Commented" or "Most Emailed" lists that you see in the right sidebar of news/blog websites.

The list has to be relative to a recent date range (say the last 7 days) because you want the list to always to look fresh and contain newer content. Otherwise, the few articles that got tons of comments will always sit atop the list.

For instance, in my case, the website has a few articles that have several hundred comments each but all the rest have less than 20. So if the "Most Commented" list was pulled like this:

select
  ArticleId,
  count(CommentId) as Comments
from
  Comment
group by
  ArticleId
order by
  count(CommentId) desc
limit 10;

Then the few articles with several hundred comments would always be at the top. The list would never change.

So I rewrote the select to include just the last 7 days:

select
  ArticleId,
  count(CommentId) as Comments
from
  Comment
where
  Created >= '2011-06-14'
group by
  ArticleId
order by
  count(CommentId) desc
limit 10;

That looks better, but it still won't work because it doesn't account for situations in which there is no activity in the last 7 days.

If nothing happened (or only a few comments happened) in the last 7 days, the list should contain activity from before that. Bottom line - the list always needs to show 10 articles.

I could always call multiple SQL statements, going back in 7 days increments until I fill the list, but I don't want to do that. I'm caching the results in in-process memory, but I still want to only make one SQL call if at all possible.

Thoughts? I feel like this is easily doable and I'm just overlooking the obvious somewhere.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

征﹌骨岁月お 2024-11-23 18:15:29

我建议计算每个结果的权重并按其排序。例如,您的体重可以是操作次数(评论、电子邮件等)/年龄(以天为单位)。这样,文章越旧,其权重就越低,除非它具有极高的行动率。

类似于:

select
  ArticleId,
  count(CommentId) / (1.0 * DATEDIFF(CURRENT_DATE, Created)) AS weight
from
  Comment
group by
  ArticleId
order by
  weight desc
limit 10;

您需要尝试使用权重公式,直到获得较新内容与稍旧但活跃的内容的正确组合。

I recommend computing a weight for each result and sort by that. For example, your weight could be the number of actions (comments, emails, etc) / age in days. That way, the older the article, the lower its weight unless it has an extremely high action rate.

Something like:

select
  ArticleId,
  count(CommentId) / (1.0 * DATEDIFF(CURRENT_DATE, Created)) AS weight
from
  Comment
group by
  ArticleId
order by
  weight desc
limit 10;

You'll want to play with your weight formula until you get the right mix of newer content with slightly older but active content.

我是有多爱你 2024-11-23 18:15:29

这是另一种方法,可以回答您所写的问题(我之前的回答建议采用不同的方法)。

这个将返回本周的顶部,如果需要填充前 10 名,则返回前几周的更多内容。它的工作原理是计算一列,该列表示文章的发布周数,然后按该列进行排序,然后按评论数进行排序。

这里的优点是,将始终返回 10 个结果(假设您有超过 10 篇文章),并将返回按年龄和评论排序的结果,最近几周的文章位于顶部,接下来是一周前的文章,等等。

select
  ArticleId,
  count(CommentId) as number_of_comments,
  DATEDIFF(CURRENT_DATE, Created) DIV 7 AS weeks_old
from
  Comment
group by
  ArticleId,
  weeks_old
order by
  weeks_old asc,
  number_of_comments desc
limit 10;

Here is another approach that answers your question as written (my previous answer suggests a different approach).

This one will return the top for this week and more from previous weeks if needed to fill the top 10. It works by computing a column that is the number of weeks old the article is and sorts by that and then the number of comments.

The advantage here is that will will always return 10 results (assuming you have more than 10 articles) and will return them sorted by age and comments with the most recent weeks articles at the top and the articles that are one week old next, etc.

select
  ArticleId,
  count(CommentId) as number_of_comments,
  DATEDIFF(CURRENT_DATE, Created) DIV 7 AS weeks_old
from
  Comment
group by
  ArticleId,
  weeks_old
order by
  weeks_old asc,
  number_of_comments desc
limit 10;
在巴黎塔顶看东京樱花 2024-11-23 18:15:29

我假设您还想限制记录数量。尝试如下操作:(注意:我使用 MSSQL,因此某些详细信息/语法对于 mysql 可能不正确...抱歉!)

select top 10
  ArticleId,
  case when Created>='2011-06-14' then 1 else 0 end as [isNew],
  count(CommentId) as Comments
from
  Comment
group by
  ArticleId,
  case when Created>='2011-06-14' then 1 else 0 end
order by
  isNew desc,
  Comments desc

I assume you also want to limit to a number of records. Try something like the following: (NOTE: I use MSSQL, so some details/syntax might be incorrect for mysql...sorry!)

select top 10
  ArticleId,
  case when Created>='2011-06-14' then 1 else 0 end as [isNew],
  count(CommentId) as Comments
from
  Comment
group by
  ArticleId,
  case when Created>='2011-06-14' then 1 else 0 end
order by
  isNew desc,
  Comments desc
静谧 2024-11-23 18:15:29

假设您想继续以 7 天的增量填充它:

SELECT
    C.article_id,
    COUNT(C.comment_id)
FROM
    Comments C
INNER JOIN Numbers N ON
    N.number >= 0 AND
    N.number <=4 AND
    C.created <= CURRENT_DATE - INTERVAL (N.number * 7) DAY AND
    C.created >= CURRENT_DATE - INTERVAL ((N.number + 1) * 7) DAY
GROUP BY
    C.article_id
ORDER BY
    N.number
    COUNT(C.comment_id)
LIMIT 10

我不使用 MySQL,因此您可能需要使用语法,尤其是日期函数。

另外,这使用了一个数字表,它只是一个从 1 到任何数字的整数表。也许 MySQL 有一个内置函数可以实现此目的,否则您将需要创建或生成表。在这种情况下,它最多只能返回四个星期,因此您甚至可以简单地使用 SELECT 1 UNION ALL SELECT 2...

最后,由于它只能返回四个星期,您仍然可能会得到少于 10 篇文章。您可以将周数增加到在您的业务案例中看起来合理的值。

Assuming that you want to keep filling it in by 7 day increments:

SELECT
    C.article_id,
    COUNT(C.comment_id)
FROM
    Comments C
INNER JOIN Numbers N ON
    N.number >= 0 AND
    N.number <=4 AND
    C.created <= CURRENT_DATE - INTERVAL (N.number * 7) DAY AND
    C.created >= CURRENT_DATE - INTERVAL ((N.number + 1) * 7) DAY
GROUP BY
    C.article_id
ORDER BY
    N.number
    COUNT(C.comment_id)
LIMIT 10

I don't work with MySQL, so you may need to play with the syntax, especially with the date functions.

Also, this uses a Numbers table, which is just a table of numeric integers from 1 to whatever. Maybe MySQL has a built-in function for this, otherwise you would need to create or generate the table. It only goes back up to four weeks in this case, so you could even simply use a SELECT 1 UNION ALL SELECT 2...

Finally, since it only goes back four weeks, you could still end up with less than 10 articles. You can increase the number of weeks to something that seems reasonable in your business case.

旧时光的容颜 2024-11-23 18:15:29

我可能遗漏了一些东西,但你不能将你已经拥有的东西结合起来吗?

我不知道 mySQL 语法,所以你必须转换。

select top 10 *
from 
(
select ArticleId, count(CommentId) as Comments 
from  Comment 
where   Created >= '2011-06-14' 
group by   ArticleId 
union
select ArticleId, count(CommentId) as Comments 
from Comment 
group by  ArticleId 
) 
order by comments desc

I may be missing something but couldn't you Union what you already have?

I don't know the mySQL syntax so you will have to convert.

select top 10 *
from 
(
select ArticleId, count(CommentId) as Comments 
from  Comment 
where   Created >= '2011-06-14' 
group by   ArticleId 
union
select ArticleId, count(CommentId) as Comments 
from Comment 
group by  ArticleId 
) 
order by comments desc
莫言歌 2024-11-23 18:15:29

这将为您提供根据文章收到的评论总数排序的最近评论的文章。

SELECT ArticleId, COUNT(CommentId) CommentsCount FROM Comment WHERE ArticleID IN
(
select DISTINCT ArticleId FROM Comment ORDER BY Created limit 10
) ORDER BY CommentsCount DESC

希望这更接近你想要的。如果您需要准确的查询,即仅在 CommentsCount 字段中包含最近几周的评论,请告诉我:-)

This will give you the recently commented articles sorted based on the total comments received by an article.

SELECT ArticleId, COUNT(CommentId) CommentsCount FROM Comment WHERE ArticleID IN
(
select DISTINCT ArticleId FROM Comment ORDER BY Created limit 10
) ORDER BY CommentsCount DESC

Hope that is closer to what you want. If you need the accurate query i.e. include only the last weeks comment in the CommentsCount field, please let me know :-)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文