MySQL: how to sum rows over a given range of timestamps?

Posted on 11-26 04:57

Given a table with a timestamp column, e.g.:

    timestamp           |  id  |  value  
    --------------------------------------
    2001-01-01 00:00:00 |  1   |  3
    2001-01-01 00:00:00 |  2   |  5
    --------------------------------------
    2001-01-02 00:00:00 |  1   |  6
    2001-01-02 00:00:00 |  2   |  10
    2001-01-02 00:00:00 |  3   |  7
    --------------------------------------
    2001-01-03 00:00:00 |  3   |  14
    2001-01-03 00:00:00 |  2   |  15
    --------------------------------------
    2001-01-04 00:00:00 |  1   |  9
    2001-01-04 00:00:00 |  2   |  20

and a given aggregation level, say 2 days, I would like to aggregate (sum) the results over:

(1) a moving window of the given agg-level, for the above example: 2001-01-01 to 2001-01-02, 2001-01-02 to 2001-01-03, 2001-01-03 to 2001-01-04 which will result in:

    timestamp_1         | timestamp_2         |  id  |  agg_value  
    -----------------------------------------------------------
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  1   |  9 (=3+6)
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  2   |  15 (=5+10)
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  3   |  7 (=7)
    -----------------------------------------------------------
    2001-01-02 00:00:00 | 2001-01-03 00:00:00 |  1   |  6 (=6)
    2001-01-02 00:00:00 | 2001-01-03 00:00:00 |  2   |  25 (=10+15)
    2001-01-02 00:00:00 | 2001-01-03 00:00:00 |  3   |  21 (=7+14)
    -----------------------------------------------------------
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  1   |  9 (=9)
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  2   |  35 (=15+20)
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  3   |  14 (=14)

(2) a non-overlapping division into ranges of the given length, for the above example: 2001-01-01 to 2001-01-02, 2001-01-03 to 2001-01-04, which will result in:

    timestamp_1         | timestamp_2         |  id  |  agg_value  
    -----------------------------------------------------------
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  1   |  9 (=3+6)
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  2   |  15 (=5+10)
    2001-01-01 00:00:00 | 2001-01-02 00:00:00 |  3   |  7 (=7)
    -----------------------------------------------------------
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  1   |  9 (=9)
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  2   |  35 (=15+20)
    2001-01-03 00:00:00 | 2001-01-04 00:00:00 |  3   |  14 (=14)

(which is basically like (1) without the overlap)

Thanks!


Edited: adding a solution

I have a solution at least for (1):

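    SELECT t1.timestamp AS timestamp1,
           -- last timestamp actually present in the window, not the nominal window end
           MAX(t2.timestamp) AS timestamp2,
           t1.id,
           SUM(t2.value) AS agg_value
    FROM my_table t1
    LEFT JOIN my_table t2
           ON t2.id = t1.id
          AND t2.timestamp >= t1.timestamp
          -- half-open bound: "<" keeps the window at 2 days; "<=" would span 3
          AND t2.timestamp < ADDDATE(t1.timestamp, INTERVAL 2 DAY)
    GROUP BY t1.timestamp, t1.id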

A solution for (2) can probably be just filtering to a subset of the above.
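
As a sanity check on the expected numbers, here is a small Python sketch (not MySQL) that computes both variants from the sample rows. It assumes the last two sample rows are dated 2001-01-04, which is what the expected output implies, and it treats each window as the half-open range from its start to start + 2 days. The names `ROWS` and `window_sums` are made up for this illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows as (day, id, value). Assumption: the last two rows fall on
# 2001-01-04, as the expected output in the question implies.
ROWS = [
    (date(2001, 1, 1), 1, 3), (date(2001, 1, 1), 2, 5),
    (date(2001, 1, 2), 1, 6), (date(2001, 1, 2), 2, 10), (date(2001, 1, 2), 3, 7),
    (date(2001, 1, 3), 3, 14), (date(2001, 1, 3), 2, 15),
    (date(2001, 1, 4), 1, 9), (date(2001, 1, 4), 2, 20),
]


def window_sums(rows, starts, days):
    """Sum value per (window start, id) over the range [start, start + days)."""
    totals = defaultdict(int)
    for start in starts:
        end = start + timedelta(days=days)
        for day, row_id, value in rows:
            if start <= day < end:
                totals[(start, row_id)] += value
    return dict(totals)


first = date(2001, 1, 1)
# (1) moving window: a new 2-day window starts every day
moving = window_sums(ROWS, [first + timedelta(days=i) for i in range(3)], days=2)
# (2) non-overlapping: a new 2-day window starts every 2 days
tiled = window_sums(ROWS, [first + timedelta(days=i) for i in (0, 2)], days=2)

for (start, row_id), total in sorted(moving.items()):
    print(start, row_id, total)
```

With this data, `tiled` contains exactly the entries of `moving` whose window starts on 2001-01-01 or 2001-01-03, which is the "filter to a subset" idea for (2).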

Comments (1)

我恋#小黄人 2024-12-03 04:57:40

This groups rows into buckets of X days by taking the date portion of the timestamp, computing its difference in days from a baseline, and grouping by that difference plus the ID. This will get your #2 solution:

    SELECT
        CEILING( DATEDIFF( DATE( NOW() ), DATE( myTimeStamp )) / 2 ) AS DaysDiff,
        ID,
        MIN( DATE( myTimeStamp )) AS FirstDateInGroup,
        MAX( DATE( myTimeStamp )) AS LastDateInGroup,
        SUM( value ) AS SumVal
    FROM TimeSample
    GROUP BY DaysDiff, ID
    ORDER BY FirstDateInGroup, ID
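
To see how the `DaysDiff` expression assigns days to buckets, the following Python sketch mirrors the arithmetic of `CEILING( DATEDIFF( baseline, day ) / 2 )`. It is only an illustration: the helper `days_diff_bucket` and the two baseline dates (standing in for `DATE( NOW() )` on two consecutive days) are assumptions, not part of the query.

```python
import math
from datetime import date


def days_diff_bucket(baseline, day, width=2):
    """Mirror CEILING(DATEDIFF(baseline, day) / width) from the query."""
    return math.ceil((baseline - day).days / width)


days = [date(2001, 1, d) for d in (1, 2, 3, 4)]

# Hypothetical baselines standing in for DATE(NOW()) on two consecutive days.
for baseline in (date(2001, 1, 4), date(2001, 1, 5)):
    print(baseline, [days_diff_bucket(baseline, d) for d in days])
```

With the baseline on 2001-01-05 the four days pair up as (01-01, 01-02) and (01-03, 01-04). With the baseline one day earlier, 01-02 and 01-03 share a bucket instead, so which days land together depends on the baseline chosen.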

EDIT ---- PER COMMENTS

Your sample showed how to handle 2 days, and so does this. The "now()" is just a baseline for grouping your data. If you want it broken down by year, I would just group by YEAR( YourDateColumn ). If you want 30 days, divide by 30. For monthly, I would group by YEAR( YourDateColumn ) and MONTH( YourDateColumn ). The fixed "now()" does nothing but provide a number as a starting point. If your data was 2 years old, the date difference would be 365 days * 2 years = 730 days; divide by 2 and you're back at a group number of 365. You can add whatever WHERE clause you want to further limit the time period you are interested in, for example

where myTimeStamp between '2011-01-01' and '2011-06-30' to get just the first 6 months of this year... That would result in a DaysDiff grouping of 208 days / 2 = 104.

So, if you have some other baseline you would rather group from, just change the now() to something like '2011-01-01' and it will compute relative to Jan 1, 2011. DaysDiff is then positive for dates before the baseline, zero at the baseline, and negative for dates after it.
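
A rough Python sketch of the fixed-baseline variant, applied to the question's sample data. It assumes a baseline of 2001-01-01 (chosen here to match the sample, not taken from the answer) and assumes the last two sample rows are dated 2001-01-04, as the expected output implies. The bucket number is the same `CEILING( DATEDIFF(...) / 2 )` arithmetic, so it is 0 or negative for every row.

```python
import math
from collections import defaultdict
from datetime import date

BASELINE = date(2001, 1, 1)  # assumed fixed baseline used in place of now()

# Sample rows as (day, id, value); last two rows assumed to be 2001-01-04.
ROWS = [
    (date(2001, 1, 1), 1, 3), (date(2001, 1, 1), 2, 5),
    (date(2001, 1, 2), 1, 6), (date(2001, 1, 2), 2, 10), (date(2001, 1, 2), 3, 7),
    (date(2001, 1, 3), 3, 14), (date(2001, 1, 3), 2, 15),
    (date(2001, 1, 4), 1, 9), (date(2001, 1, 4), 2, 20),
]

sums = defaultdict(int)
for day, row_id, value in ROWS:
    # Same arithmetic as CEILING(DATEDIFF(baseline, day) / 2):
    # 0 for the first two days, -1 for the next two, and so on.
    bucket = math.ceil((BASELINE - day).days / 2)
    sums[(bucket, row_id)] += value

# Print the latest bucket number first, i.e. the earliest dates first.
for (bucket, row_id), total in sorted(sums.items(), key=lambda kv: (-kv[0][0], kv[0][1])):
    print(bucket, row_id, total)
```

Bucket 0 covers 2001-01-01 to 2001-01-02 and bucket -1 covers 2001-01-03 to 2001-01-04, so the sums line up with the non-overlapping table in the question.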
