MySQL: Calculating growth over time

Posted on 2024-10-22 06:36:41

I posted about this a few weeks ago, but I don't think I asked the question clearly because the answers I got were not what I was looking for. I think it's best to start again.

I'm trying to query a database to retrieve the number of unique entries over time. The data looks something like this:

Day | UserID
1 | A
1 | B
2 | B
3 | A
4 | B
4 | C
5 | D

I'd like the query result to look like this:

Time Span | COUNT(DISTINCT UserID)
Day 1 to Day 1 | 2
Day 1 to Day 2 | 2
Day 1 to Day 3 | 2
Day 1 to Day 4 | 3
Day 1 to Day 5 | 4

If I do something like

SELECT COUNT(DISTINCT `UserID`) FROM `table` GROUP BY `Day`

the distinct counts will not consider user IDs from previous days.

Any ideas? The data set I'm using is quite large, so running multiple queries and post-processing the results takes a long time (that's how I'm currently doing it).

Thanks

Comments (2)

薄情伤 2024-10-29 06:36:41

You can use a correlated subquery.

Sample table

create table visits (day int, userid char(1));
insert into visits values
(1,'a'),
(1,'b'),
(2,'b'),
(3,'a'),
(4,'b'),
(4,'c'),
(5,'d');

The query

-- for each distinct day, count the distinct users seen on or before that day
select d.day,
       (select count(distinct userid) from visits where day <= d.day) as distinct_users
from (select distinct day from visits) d;
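
To check the subquery against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption, chosen so the example is self-contained), and the `distinct_users` alias and the `order by` are additions for readability.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption) so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("create table visits (day int, userid char(1))")
conn.executemany(
    "insert into visits values (?, ?)",
    [(1, "a"), (1, "b"), (2, "b"), (3, "a"), (4, "b"), (4, "c"), (5, "d")],
)

# For each distinct day, count the distinct users seen on or before that day.
query = """
select d.day,
       (select count(distinct userid) from visits where day <= d.day) as distinct_users
from (select distinct day from visits) d
order by d.day
"""
rows = conn.execute(query).fetchall()
for day, distinct_users in rows:
    print(f"Day 1 to Day {day} | {distinct_users}")
```

The inner query is evaluated once per day, so on a large table an index covering (day, userid) may help; that has not been measured here.
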
许久 2024-10-29 06:36:41

How about something like this:

SELECT Count(UserID), Day
FROM
    -- one row per (Day, UserID); MySQL requires an alias on a derived table
    (SELECT Count(UserID) AS Logons, UserID, Day
    FROM yourDailyLog
    GROUP BY Day, UserID) AS t
GROUP BY Day

The inner select should eliminate duplicate visits by the same user on a given day.
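
Run against the question's sample rows, this grouping gives one count per day rather than a running total. A minimal sketch, using SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption), with the derived table given an alias `t` (a hypothetical name):

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute("create table yourDailyLog (Day int, UserID char(1))")
conn.executemany(
    "insert into yourDailyLog values (?, ?)",
    [(1, "A"), (1, "B"), (2, "B"), (3, "A"), (4, "B"), (4, "C"), (5, "D")],
)

# Inner select: one row per (Day, UserID). Outer select: users per day.
query = """
SELECT Count(UserID), Day
FROM
    (SELECT Count(UserID) AS Logons, UserID, Day
     FROM yourDailyLog
     GROUP BY Day, UserID) AS t
GROUP BY Day
ORDER BY Day
"""
per_day = conn.execute(query).fetchall()
for user_count, day in per_day:
    print(f"Day {day} | {user_count}")
```

Days 2, 3 and 5 each report a single user, so earlier days are not carried forward.
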

Stay away from DISTINCT. It is usually a questionable approach to almost any SQL problem.

Wait: I see now that you want the time period to grow over time. That makes things a little trickier. Why don't you aggregate the rest of this information in code rather than doing it all in SQL?
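
Aggregating in code can be done in a single pass if the rows arrive ordered by day. A minimal sketch, assuming the rows have already been fetched as (day, user_id) tuples sorted by day; the function name `cumulative_distinct` is hypothetical:

```python
from itertools import groupby
from operator import itemgetter


def cumulative_distinct(rows):
    """Return [(day, distinct_users_so_far)] for rows sorted by day."""
    seen = set()   # every user ID encountered up to the current day
    result = []
    for day, group in groupby(rows, key=itemgetter(0)):
        seen.update(user_id for _, user_id in group)
        result.append((day, len(seen)))
    return result


# Sample data from the question, already ordered by day.
rows = [(1, "A"), (1, "B"), (2, "B"), (3, "A"), (4, "B"), (4, "C"), (5, "D")]
totals = cumulative_distinct(rows)
for day, total in totals:
    print(f"Day 1 to Day {day} | {total}")
```

This reads each row once, but the `seen` set holds every distinct user ID, so memory grows with the number of users.
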
