Finding start and end dates in T-SQL (set-based)
I have the following data.
Name Date
A 2011-01-01 01:00:00.000
A 2011-02-01 02:00:00.000
A 2011-03-01 03:00:00.000
B 2011-04-01 04:00:00.000
A 2011-05-01 07:00:00.000
The desired output is
Name StartDate EndDate
-------------------------------------------------------------------
A 2011-01-01 01:00:00.000 2011-04-01 04:00:00.000
B 2011-04-01 04:00:00.000 2011-05-01 07:00:00.000
A 2011-05-01 07:00:00.000 NULL
How can I achieve this in T-SQL using a set-based approach?
The DDL is as follows:
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME)
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00')
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00')
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00')
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00')
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00')
Select * from @t
Comments (5)
Because the results of CTEs are generally not materialized, you may well find that you get better performance if you materialize the intermediate result yourself.
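The query this answer refers to did not survive in the text, so the following is only a sketch of what materializing the intermediate result could look like, not the answerer's actual code. It assumes the @t table variable from the question. The temp tables #Numbered and #Starts and their columns are hypothetical names chosen for illustration, and the sketch assumes no two rows share the same Date.

```sql
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00');
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00');
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00');
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00');
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00');

-- Step 1: number the rows by date once and store the result,
-- so later joins read a keyed table rather than re-evaluating a CTE.
CREATE TABLE #Numbered (rn INT PRIMARY KEY, PersonName VARCHAR(32), [Date] DATETIME);

INSERT INTO #Numbered (rn, PersonName, [Date])
SELECT ROW_NUMBER() OVER (ORDER BY [Date]), PersonName, [Date]
FROM @t;

-- Step 2: keep only the rows that start a new run (the first row,
-- or a row whose predecessor has a different name) and number them.
CREATE TABLE #Starts (seq INT PRIMARY KEY, PersonName VARCHAR(32), StartDate DATETIME);

INSERT INTO #Starts (seq, PersonName, StartDate)
SELECT ROW_NUMBER() OVER (ORDER BY cur.[Date]), cur.PersonName, cur.[Date]
FROM #Numbered AS cur
LEFT JOIN #Numbered AS prev ON prev.rn = cur.rn - 1
WHERE prev.rn IS NULL OR prev.PersonName <> cur.PersonName;

-- Step 3: the end of each run is the start of the following run
-- (NULL for the last run).
SELECT s.PersonName, s.StartDate, n.StartDate AS EndDate
FROM #Starts AS s
LEFT JOIN #Starts AS n ON n.seq = s.seq + 1
ORDER BY s.StartDate;

DROP TABLE #Starts;
DROP TABLE #Numbered;
```

With the sample rows this should return three rows: A (ending where B starts), B (ending where the last A row starts), and A with a NULL EndDate. Whether the temp tables actually beat a CTE depends on data volume and indexing, so it is worth comparing execution plans.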
Basically, for every Date we find the nearest date after it that is associated with a different PersonName. That gives us EndDate, which now distinguishes consecutive groups of dates for the same person.
Now we only need to group the data by PersonName & EndDate and get the minimal Date in every group as StartDate. And yes, sort the data by StartDate, of course.
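The description above can be sketched as follows. The answer's original query is not present in the text, so this is a reconstruction from the prose under stated assumptions: it uses the @t table variable from the question, and the aliases t1, t2 and s are illustrative.

```sql
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00');
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00');
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00');
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00');
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00');

SELECT
    s.PersonName,
    MIN(s.[Date]) AS StartDate,   -- earliest date in the run
    s.EndDate
FROM (
    SELECT
        t1.PersonName,
        t1.[Date],
        -- nearest later date that belongs to a different person;
        -- NULL when no such row exists (the last run)
        (SELECT MIN(t2.[Date])
         FROM @t AS t2
         WHERE t2.[Date] > t1.[Date]
           AND t2.PersonName <> t1.PersonName) AS EndDate
    FROM @t AS t1
) AS s
GROUP BY s.PersonName, s.EndDate
ORDER BY StartDate;
```

All rows of one run share the same EndDate, which is why grouping on PersonName and EndDate separates the two runs of A. The correlated subquery is evaluated per row, so on large tables an index on [Date] would probably matter.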
Get a row number so you know where the previous record is. Then take a record and the next record after it. When the state changes, we have a candidate row.
Tested with PostgreSQL, but it should work with SQL Server as well.
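A minimal sketch of this idea in T-SQL follows. The answer's own query is missing from the text, so the CTE names numbered and starts, the column names rn and seq, and the exact join shape are assumptions. It pairs each row with the row before it, which detects the same state changes as pairing each row with the one after it, and it assumes Date values are unique so the numbering is deterministic.

```sql
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00');
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00');
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00');
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00');
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00');

WITH numbered AS (
    -- number every row in date order so neighbours differ by exactly 1
    SELECT PersonName, [Date],
           ROW_NUMBER() OVER (ORDER BY [Date]) AS rn
    FROM @t
),
starts AS (
    -- candidate rows: the first row, or a row whose predecessor
    -- has a different name (the state changed)
    SELECT cur.PersonName,
           cur.[Date] AS StartDate,
           ROW_NUMBER() OVER (ORDER BY cur.[Date]) AS seq
    FROM numbered AS cur
    LEFT JOIN numbered AS prev ON prev.rn = cur.rn - 1
    WHERE prev.rn IS NULL OR prev.PersonName <> cur.PersonName
)
-- each run ends where the next candidate row begins
SELECT s.PersonName, s.StartDate, n.StartDate AS EndDate
FROM starts AS s
LEFT JOIN starts AS n ON n.seq = s.seq + 1
ORDER BY s.StartDate;
```

ROW_NUMBER is available from SQL Server 2005 onward, so this form does not depend on LAG or LEAD.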
The other answer with the CTE is a good one. Another option would be to iterate over the collection. It's not set-based, but it is another way to do it.
You would need to iterate either (A) to assign a unique ID to each record that corresponds to its transaction, or (B) to actually produce your output.
T-SQL is not ideal for iterating over records, especially if you have a lot of them, so I would recommend doing it some other way, such as a small .NET program or something else that is better at iterating.
There's a very quick way to do this using a bit of gaps-and-islands theory.
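The answer's query is not included in the text, so here is one common form of the gaps-and-islands technique as a hedged sketch rather than the answerer's code. It assumes the @t table variable from the question and unique Date values. The names grouped, islands, grp and seq are illustrative.

```sql
DECLARE @t TABLE(PersonName VARCHAR(32), [Date] DATETIME);
INSERT INTO @t VALUES('A', '2011-01-01 01:00:00');
INSERT INTO @t VALUES('A', '2011-01-02 02:00:00');
INSERT INTO @t VALUES('A', '2011-01-03 03:00:00');
INSERT INTO @t VALUES('B', '2011-01-04 04:00:00');
INSERT INTO @t VALUES('A', '2011-01-05 07:00:00');

WITH grouped AS (
    -- the difference between the overall row number and the per-person
    -- row number stays constant within one consecutive run (an "island")
    SELECT PersonName, [Date],
           ROW_NUMBER() OVER (ORDER BY [Date])
         - ROW_NUMBER() OVER (PARTITION BY PersonName ORDER BY [Date]) AS grp
    FROM @t
),
islands AS (
    -- one row per island, numbered in order of its first date
    SELECT PersonName,
           MIN([Date]) AS StartDate,
           ROW_NUMBER() OVER (ORDER BY MIN([Date])) AS seq
    FROM grouped
    GROUP BY PersonName, grp
)
-- an island ends where the next island starts (NULL for the last one)
SELECT i.PersonName, i.StartDate, n.StartDate AS EndDate
FROM islands AS i
LEFT JOIN islands AS n ON n.seq = i.seq + 1
ORDER BY i.StartDate;
```

For the sample rows, the three leading A rows get grp 0, the B row gets grp 3, and the final A row gets grp 1, which yields the three islands the question asks for.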