SQL Server中的窗口移动平均线
我正在尝试创建一个计算 SQLServer 2008 中的窗口移动平均值的函数。我对 SQL 很陌生,所以我遇到了一些困难。我尝试执行移动平均的数据需要按天分组(都是带时间戳的数据),然后需要对其应用可变移动平均窗口。
我已经有一个按天(和@id)对数据进行分组的函数,该函数显示在底部。我有几个问题:
在移动平均函数内调用分组函数更好还是应该一次完成所有操作?
是否可以获取日期的移动平均值输入到函数中,但返回 n 天开始移动平均线,以便返回数据的前 n 天的平均值不会为 0?(即,如果他们想要从 2011 年 8 月 1 日到 2011 年 8 月 2 日的 7 天移动平均线,那么我在 2011 年 1 月 1 日开始移动平均线计算,以便他们定义的第一天有一个值?
)我正在研究如何进行移动平均线,并且知道移动窗口似乎是最佳选择(currentSum = prevSum + TodayCount - nthDayAgoCount) / nDays但我仍在研究弄清楚这个的 SQL 实现。
我有一个如下所示的分组函数(为了可见性而删除了一些变量):
SELECT
'ALL' as GeogType,
CAST(v.AdmissionOn as date) as dtAdmission,
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
COUNT(*) as nVisits
FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010')
AND v.ID = Coalesce(@id,ID)
GROUP BY
CAST(v.AdmissionOn as date),
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
ORDER BY 2,3,4
它返回一个像这样的表:
ALL 2010-01-01 1 103
ALL 2010-01-02 1 114
ALL 2010-01-03 1 86
ALL 2010-01-04 1 88
ALL 2010-01-05 1 84
ALL 2010-01-06 1 87
ALL 2010-01-07 1 82
编辑:要回答我问的第一个问题:
我最终创建了一个声明的函数一个临时表,并将 count 函数的结果插入其中,然后使用 user662852
中的示例来计算移动平均值。
I am trying to create a function that computes a windowed moving average in SQLServer 2008. I am quite new to SQL so I am having a fair bit of difficulty. The data that I am trying to perform the moving average on needs to be grouped by day (it is all timestamped data) and then a variable moving average window needs to be applied to it.
I already have a function that groups the data by day (and @id) which is shown at the bottom. I have a few questions:
Would it be better to call the grouping function inside the moving average function or should I do it all at once?
Is it possible to get the moving average for the dates input into the function, but go back n days to begin the moving average so that the first n days of the returned data will not have 0 for their average? (ie. if they want a 7 day moving average from 01-08-2011 to 02-08-2011 that I start the moving average calculation on 01-01-2011 so that the first day they defined has a value?)
I am in the process of looking into how to do the moving average, and know that a moving window seems to be the best option (currentSum = prevSum + todayCount - nthDayAgoCount) / nDays but I am still working on figuring out the SQL implementation of this.
I have a grouping function that looks like this (some variables removed for visibility purposes):
SELECT
'ALL' as GeogType,
CAST(v.AdmissionOn as date) as dtAdmission,
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
COUNT(*) as nVisits
FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010')
AND v.ID = Coalesce(@id,ID)
GROUP BY
CAST(v.AdmissionOn as date),
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
ORDER BY 2,3,4
Which returns a table like so:
ALL 2010-01-01 1 103
ALL 2010-01-02 1 114
ALL 2010-01-03 1 86
ALL 2010-01-04 1 88
ALL 2010-01-05 1 84
ALL 2010-01-06 1 87
ALL 2010-01-07 1 82
EDIT: To answer the first question I asked:
I ended up creating a function which declared a temporary table and inserted the results from the count function into it, then used the example from user662852
to compute the moving average.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
从查询中取出硬编码的日期范围。将输出(如最后的示例)写入临时表(我在下面将其称为#visits)。
尝试对临时表进行这种自连接:
编辑:我在评论中没有足够的空间来回答您的问题:
我的连接是“有点笛卡尔”,因为它在连接约束。列表中的每条记录都会与其他所有记录进行比较,然后我想要报告日期介于 (-7) 天和今天之间的记录。每个数据日期都可以列出日期,这是您问题的关键。我本来可以将连接条件写为
但真正发生的情况是我测试了它,
它不返回任何记录,因为语法是“介于低和高之间”。我对 0 条记录进行了捂脸,交换了参数,仅此而已。
尝试以下操作,看看我的意思: 这是仅一个列表日期的笛卡尔连接:
将其与实际连接条件进行比较
看看所有记录中的列表日期在 datadate 和 dataplus6 之间如何?
Take the hardcoded date range out of your query. Write the output (like your sample at the end) to a temp table (I called it #visits below).
Try this self join to the temp table:
EDIT: I didn't have enough room in Comments to say this in response to your question:
My join is "kinda cartesian" because it uses a between in the join constraint. Each record in list is going up against every other record, and then I want the ones where the date I report is between a lower bound of (-7) days and today. Every data date is available to list date, this is the key to your question. I could have written the join condition as
But what really happened was I tested it as
Which returns no records because the syntax is "Between LOW and HIGH". I facepalmed on 0 records and swapped the arguments, that's all.
Try the following, see what I mean: This is the cartesian join for just one listdate:
Compare this to the actual join condition
See how list date is between datadate and dataplus6 in all the records?