How to determine values for missing months based on data from previous months in T-SQL
I have a set of transactions occurring at specific points in time:
    CREATE TABLE Transactions (
        TransactionDate Date NOT NULL,
        TransactionValue Integer NOT NULL
    )
The data might be:
    INSERT INTO Transactions (TransactionDate, TransactionValue)
        VALUES ('1/1/2009', 1)
    INSERT INTO Transactions (TransactionDate, TransactionValue)
        VALUES ('3/1/2009', 2)
    INSERT INTO Transactions (TransactionDate, TransactionValue)
        VALUES ('6/1/2009', 3)
Assuming that the TransactionValue sets some kind of level, I need to know what the level was between the transactions. I need this in the context of a set of T-SQL queries, so it would be best if I could get a result set like this:
    Month    Value
    1/2009   1
    2/2009   1
    3/2009   2
    4/2009   2
    5/2009   2
    6/2009   3
Note how, for each month, we either get the value specified in the transaction, or we get the most recent non-null value.
My problem is that I have little idea how to do this! I'm only an "intermediate" level SQL Developer, and I don't remember ever seeing anything like this before. Naturally, I could create the data I want in a program, or using cursors, but I'd like to know if there's a better, set-oriented way to do this.
I'm using SQL Server 2008, so if any of the new features will help, I'd like to hear about them.
P.S. If anyone can think of a better way to state this question, or even a better subject line, I'd greatly appreciate it. It took me quite a while to decide that "spread", while lame, was the best I could come up with. "Smear" sounded worse.
John Gibb posted a fine answer, already accepted, but I wanted to expand on it a bit to:
explicit manner, and
numbers table.
This slight variation uses a recursive common table expression to build the set of Dates: the first of each month on or after the from date, up to the to date, both defined in DateRange. Note the use of the MAXRECURSION option to guard against runaway recursion (!); the default limit is 100 levels, so adjust it as necessary to accommodate the maximum number of months expected. Also, consider adding alternative Dates assembly logic to support weeks, quarters, or even day-to-day ranges.
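A minimal sketch of the recursive CTE approach described above, not the answerer's original code. The CTE names DateRange and Dates come from the text; the column names (FromDate, ToDate, MonthStart), the hard-coded range, and the MAXRECURSION value of 120 are assumptions for illustration.

```sql
-- Assumed range: January through June 2009, matching the question's sample data.
WITH DateRange (FromDate, ToDate) AS (
    SELECT CAST('20090101' AS date), CAST('20090601' AS date)
),
Dates (MonthStart, ToDate) AS (
    -- Anchor: the first of the month on or after FromDate.
    -- Stepping back one day and then moving to the start of the next month
    -- returns FromDate itself when it already falls on the 1st.
    SELECT CAST(DATEADD(MONTH,
                        DATEDIFF(MONTH, 0, DATEADD(DAY, -1, FromDate)) + 1,
                        0) AS date),
           ToDate
    FROM DateRange
    UNION ALL
    -- Recursive step: advance one month at a time until ToDate is passed.
    SELECT DATEADD(MONTH, 1, MonthStart), ToDate
    FROM Dates
    WHERE DATEADD(MONTH, 1, MonthStart) <= ToDate
)
SELECT d.MonthStart, t.TransactionValue
FROM Dates AS d
OUTER APPLY (
    -- Most recent transaction dated on or before the end of this month.
    SELECT TOP (1) tr.TransactionValue
    FROM Transactions AS tr
    WHERE tr.TransactionDate < DATEADD(MONTH, 1, d.MonthStart)
    ORDER BY tr.TransactionDate DESC
) AS t
ORDER BY d.MonthStart
OPTION (MAXRECURSION 120);  -- assumed cap: at most ten years of months
```

Months that precede the first transaction come back with a NULL value, because OUTER APPLY keeps the month row even when no transaction qualifies.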
If you do this type of analysis often, you might be interested in this SQL Server function I put together for exactly this purpose.
It was an answer to this question, which also has some sample output from it.
I don't have access to BOL from my phone so this is a rough guide...
First, you need to generate the missing rows for the months where you have no data. You can either use an OUTER join to a fixed table or temp table covering the timespan you want, or join to a programmatically created dataset (from a stored proc or suchlike).
Second, you should look at the new SQL 2008 'analytic' functions, like MAX(value) OVER ( partition clause ) to get the previous value.
(I KNOW Oracle can do this 'cause I needed it to calculate compounded interest calcs between transaction dates - same problem really)
Hope this points you in the right direction...
(Avoid throwing it into a temp table and cursoring over it. Too crude!!!)
-----Alternative way------
I'd start by building a Numbers table holding sequential integers from 1 to a million or so. They come in really handy once you get the hang of it.
For example, here is how to get the 1st of every month in 2008:
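A minimal sketch of that step, not the answerer's original snippet. The table name dbo.Numbers, its column Number, and the use of sys.all_objects as a convenient row source are assumptions for illustration.

```sql
-- Assumed helper table: sequential integers starting at 1.
CREATE TABLE dbo.Numbers (Number INT NOT NULL PRIMARY KEY);

-- Fill it with 1..1000; cross-joining a system view is one common way
-- to get enough rows to number.
INSERT INTO dbo.Numbers (Number)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- The 1st of every month in 2008: add 0..11 months to 1 January 2008.
SELECT DATEADD(MONTH, n.Number - 1, CAST('20080101' AS date)) AS MonthStart
FROM dbo.Numbers AS n
WHERE n.Number <= 12
ORDER BY n.Number;
```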
Now, you can put that together using OUTER APPLY to find the most recent transaction for each date like so:
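A minimal sketch of the OUTER APPLY step, assuming a Numbers table named dbo.Numbers with an integer column Number (both names hypothetical) and a hard-coded range of January through June 2009 to match the question's sample data.

```sql
SELECT m.MonthStart, t.TransactionValue
FROM (
    -- One row per month: 1 January 2009 plus 0..5 months.
    SELECT DATEADD(MONTH, n.Number - 1, CAST('20090101' AS date)) AS MonthStart
    FROM dbo.Numbers AS n          -- assumed helper table of integers 1, 2, 3, ...
    WHERE n.Number <= 6
) AS m
OUTER APPLY (
    -- Most recent transaction dated before the start of the following month,
    -- so a transaction anywhere inside the month counts for that month.
    SELECT TOP (1) tr.TransactionValue
    FROM Transactions AS tr
    WHERE tr.TransactionDate < DATEADD(MONTH, 1, m.MonthStart)
    ORDER BY tr.TransactionDate DESC
) AS t
ORDER BY m.MonthStart;
```

OUTER APPLY rather than CROSS APPLY keeps months that have no earlier transaction at all; those rows carry a NULL value instead of disappearing.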
This should give you what you're looking for, but you might have to Google around a little to find the best way to create the Numbers table.
Here's what I came up with:
To do it in a set-based way, you need sets for all of your data or information. In this case there's the overlooked data of "What months are there?" It's very useful to have a "Calendar" table as well as a "Number" table in databases as utility tables.
Here's a solution using one of these methods. The first bit of code sets up your calendar table. You can fill it using a cursor or manually or whatever and you can limit it to whatever date range is needed for your business (back to 1900-01-01 or just back to 1970-01-01 and as far into the future as you want). You can also add any other columns that are useful for your business.
Now, using this table your question becomes trivial:
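A minimal sketch of the calendar-table approach, not the answerer's original code. The table name dbo.Calendar, its single column CalendarDate, the one-row-per-day layout, and the 2009 date range are all assumptions for illustration.

```sql
-- Assumed utility table: one row per calendar day.
CREATE TABLE dbo.Calendar (
    CalendarDate date NOT NULL PRIMARY KEY
);

-- Populate every day of 2009; a recursive CTE is just one way to fill it.
WITH Days (CalendarDate) AS (
    SELECT CAST('20090101' AS date)
    UNION ALL
    SELECT DATEADD(DAY, 1, CalendarDate)
    FROM Days
    WHERE CalendarDate < '20091231'
)
INSERT INTO dbo.Calendar (CalendarDate)
SELECT CalendarDate
FROM Days
OPTION (MAXRECURSION 366);

-- For the first day of each month in range, look up the most recent
-- transaction dated before the start of the following month.
SELECT c.CalendarDate AS MonthStart,
       (SELECT TOP (1) tr.TransactionValue
        FROM Transactions AS tr
        WHERE tr.TransactionDate < DATEADD(MONTH, 1, c.CalendarDate)
        ORDER BY tr.TransactionDate DESC) AS TransactionValue
FROM dbo.Calendar AS c
WHERE DAY(c.CalendarDate) = 1
  AND c.CalendarDate BETWEEN '20090101' AND '20090601'
ORDER BY c.CalendarDate;
```

A per-day calendar is more than this query needs, but it lets the same table serve weekly or daily reports; a per-month table would work equally well here.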