SQL Server中的窗口移动平均线

发布于 2024-10-25 15:48:23 字数 1473 浏览 7 评论 0原文

我正在尝试创建一个计算 SQLServer 2008 中的窗口移动平均值的函数。我对 SQL 很陌生,所以我遇到了一些困难。我尝试执行移动平均的数据需要按天分组(都是带时间戳的数据),然后需要对其应用可变移动平均窗口。

我已经有一个按天(和@id)对数据进行分组的函数,该函数显示在底部。我有几个问题:

在移动平均函数内调用分组函数更好还是应该一次完成所有操作?

是否可以获取日期的移动平均值输入到函数中,但返回 n 天开始移动平均线,以便返回数据的前 n 天的平均值不会为 0?(即,如果他们想要从 2011 年 8 月 1 日到 2011 年 8 月 2 日的 7 天移动平均线,那么我在 2011 年 1 月 1 日开始移动平均线计算,以便他们定义的第一天有一个值?

)我正在研究如何进行移动平均线,并且知道移动窗口似乎是最佳选择(currentSum = prevSum + TodayCount - nthDayAgoCount) / nDays但我仍在研究弄清楚这个的 SQL 实现。

我有一个如下所示的分组函数(为了可见性而删除了一些变量):

    SELECT
        'ALL' as GeogType,
        CAST(v.AdmissionOn as date) as dtAdmission,    
        CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
        COUNT(*) as nVisits
    FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
    WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010') 
          AND v.ID = Coalesce(@id,ID)
    GROUP BY    
        CAST(v.AdmissionOn as date),
        CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
    ORDER BY 2,3,4

它返回一个像这样的表:

ALL 2010-01-01  1   103
ALL 2010-01-02  1   114
ALL 2010-01-03  1   86
ALL 2010-01-04  1   88
ALL 2010-01-05  1   84
ALL 2010-01-06  1   87
ALL 2010-01-07  1   82

编辑:要回答我问的第一个问题:

我最终创建了一个声明的函数一个临时表,并将 count 函数的结果插入其中,然后使用 user662852 中的示例来计算移动平均值。

I am trying to create a function that computes a windowed moving average in SQLServer 2008. I am quite new to SQL so I am having a fair bit of difficulty. The data that I am trying to perform the moving average on needs to be grouped by day (it is all timestamped data) and then a variable moving average window needs to be applied to it.

I already have a function that groups the data by day (and @id) which is shown at the bottom. I have a few questions:

Would it be better to call the grouping function inside the moving average function or should I do it all at once?

Is it possible to get the moving average for the dates input into the function, but go back n days to begin the moving average so that the first n days of the returned data will not have 0 for their average? (ie. if they want a 7 day moving average from 01-08-2011 to 02-08-2011 that I start the moving average calculation on 01-01-2011 so that the first day they defined has a value?)

I am in the process of looking into how to do the moving average, and know that a moving window seems to be the best option (currentSum = prevSum + todayCount - nthDayAgoCount) / nDays but I am still working on figuring out the SQL implementation of this.

I have a grouping function that looks like this (some variables removed for visibility purposes):

    SELECT
        'ALL' as GeogType,
        CAST(v.AdmissionOn as date) as dtAdmission,    
        CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
        COUNT(*) as nVisits
    FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
    WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010') 
          AND v.ID = Coalesce(@id,ID)
    GROUP BY    
        CAST(v.AdmissionOn as date),
        CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
    ORDER BY 2,3,4

Which returns a table like so:

ALL 2010-01-01  1   103
ALL 2010-01-02  1   114
ALL 2010-01-03  1   86
ALL 2010-01-04  1   88
ALL 2010-01-05  1   84
ALL 2010-01-06  1   87
ALL 2010-01-07  1   82

EDIT: To answer the first question I asked:

I ended up creating a function which declared a temporary table and inserted the results from the count function into it, then used the example from user662852 to compute the moving average.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

冰火雁神 2024-11-01 15:48:24

从查询中取出硬编码的日期范围。将输出(如最后的示例)写入临时表(我在下面将其称为#visits)。
尝试对临时表进行这种自连接:

 Select list.dtadmission
   , AVG(data.nvisits) as Avg
   , SUM(data.nvisits) as sum
   , COUNT(data.nvisits) as RollingDayCount
   , MIN(data.dtadmission) as Verifymindate
   , MAX(data.dtadmission)   as Verifymaxdate
 from  #visits as list 
 inner join #visits as data  
 on list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission) group by list.dtadmission

编辑:我在评论中没有足够的空间来回答您的问题:

我的连接是“有点笛卡尔”,因为它在连接约束。列表中的每条记录都会与其他所有记录进行比较,然后我想要报告日期介于 (-7) 天和今天之间的记录。每个数据日期都可以列出日期,这是您问题的关键。我本来可以将连接条件写为

list.dtadmission between DATEADD(DD,-6,data.dtadmission) and data.dtadmission

但真正发生的情况是我测试了它,

list.dtadmission between DATEADD(DD,6,data.dtadmission) and data.dtadmission

它不返回任何记录,因为语法是“介于低和高之间”。我对 0 条记录进行了捂脸,交换了参数,仅此而已。

尝试以下操作,看看我的意思: 这是仅一个列表日期的笛卡尔连接:

 SELECT 
 list.[dtAdmission] as listdate
 ,data.[dtAdmission] as datadate
 ,data.nVisits as datadata
 ,DATEADD(dd,6,list.dtadmission) as listplus6 
 ,DATEADD(dd,6,data.dtAdmission ) as datapplus6 
 from  [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data    
 on 
 1=1
 where list.dtAdmission = '5-Jan-2011'

将其与实际连接条件进行比较

 SELECT 
      list.[dtAdmission] as listdate
      ,data.[dtAdmission] as datadate
      ,data.nVisits as datadata
      ,DATEADD(dd,6,list.dtadmission) as listplus6 
      ,DATEADD(dd,6,data.dtAdmission ) as datapplus6
from  [sandbox].[dbo].[admAvg] as list   inner join [sandbox].[dbo].[admAvg] as data    
on 
list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission)
where list.dtAdmission = '5-Jan-2011'

看看所有记录中的列表日期在 datadate 和 dataplus6 之间如何?

Take the hardcoded date range out of your query. Write the output (like your sample at the end) to a temp table (I called it #visits below).
Try this self join to the temp table:

 Select list.dtadmission
   , AVG(data.nvisits) as Avg
   , SUM(data.nvisits) as sum
   , COUNT(data.nvisits) as RollingDayCount
   , MIN(data.dtadmission) as Verifymindate
   , MAX(data.dtadmission)   as Verifymaxdate
 from  #visits as list 
 inner join #visits as data  
 on list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission) group by list.dtadmission

EDIT: I didn't have enough room in Comments to say this in response to your question:

My join is "kinda cartesian" because it uses a between in the join constraint. Each record in list is going up against every other record, and then I want the ones where the date I report is between a lower bound of (-7) days and today. Every data date is available to list date, this is the key to your question. I could have written the join condition as

list.dtadmission between DATEADD(DD,-6,data.dtadmission) and data.dtadmission

But what really happened was I tested it as

list.dtadmission between DATEADD(DD,6,data.dtadmission) and data.dtadmission

Which returns no records because the syntax is "Between LOW and HIGH". I facepalmed on 0 records and swapped the arguments, that's all.

Try the following, see what I mean: This is the cartesian join for just one listdate:

 SELECT 
 list.[dtAdmission] as listdate
 ,data.[dtAdmission] as datadate
 ,data.nVisits as datadata
 ,DATEADD(dd,6,list.dtadmission) as listplus6 
 ,DATEADD(dd,6,data.dtAdmission ) as datapplus6 
 from  [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data    
 on 
 1=1
 where list.dtAdmission = '5-Jan-2011'

Compare this to the actual join condition

 SELECT 
      list.[dtAdmission] as listdate
      ,data.[dtAdmission] as datadate
      ,data.nVisits as datadata
      ,DATEADD(dd,6,list.dtadmission) as listplus6 
      ,DATEADD(dd,6,data.dtAdmission ) as datapplus6
from  [sandbox].[dbo].[admAvg] as list   inner join [sandbox].[dbo].[admAvg] as data    
on 
list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission)
where list.dtAdmission = '5-Jan-2011'

See how list date is between datadate and dataplus6 in all the records?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文