SQL 中的日期范围交集分割
我有一个 SQL Server 2005 数据库,其中包含一个名为“成员资格”的表。
表架构是:
PersonID int, Surname nvarchar(30), FirstName nvarchar(30), Description nvarchar(100), StartDate datetime, EndDate datetime
我目前正在开发一个网格功能,该功能显示按人员划分的成员资格细目。要求之一是拆分存在日期范围交集的成员资格行。交集必须受姓氏和名字约束,即分割仅发生在具有相同姓氏和名字的成员记录中。
示例表数据:
18 Smith John Poker Club 01/01/2009 NULL 18 Smith John Library 05/01/2009 18/01/2009 18 Smith John Gym 10/01/2009 28/01/2009 26 Adams Jane Pilates 03/01/2009 16/02/2009
预期结果集:
18 Smith John Poker Club 01/01/2009 04/01/2009 18 Smith John Poker Club / Library 05/01/2009 09/01/2009 18 Smith John Poker Club / Library / Gym 10/01/2009 18/01/2009 18 Smith John Poker Club / Gym 19/01/2009 28/01/2009 18 Smith John Poker Club 29/01/2009 NULL 26 Adams Jane Pilates 03/01/2009 16/02/2009
有谁知道我如何编写一个存储过程来返回具有上述细分的结果集。
I have a SQL Server 2005 database which contains a table called Memberships.
The table schema is:
PersonID int, Surname nvarchar(30), FirstName nvarchar(30), Description nvarchar(100), StartDate datetime, EndDate datetime
I'm currently working on a grid feature which shows a break-down of memberships by person. One of the requirements is to split membership rows where there is an intersection of date ranges. The intersection must be bound by the Surname and FirstName, ie splits only occur with membership records of the same Surname and FirstName.
Example table data:
18 Smith John Poker Club 01/01/2009 NULL 18 Smith John Library 05/01/2009 18/01/2009 18 Smith John Gym 10/01/2009 28/01/2009 26 Adams Jane Pilates 03/01/2009 16/02/2009
Expected result set:
18 Smith John Poker Club 01/01/2009 04/01/2009 18 Smith John Poker Club / Library 05/01/2009 09/01/2009 18 Smith John Poker Club / Library / Gym 10/01/2009 18/01/2009 18 Smith John Poker Club / Gym 19/01/2009 28/01/2009 18 Smith John Poker Club 29/01/2009 NULL 26 Adams Jane Pilates 03/01/2009 16/02/2009
Does anyone have any idea how I could write a stored procedure that will return a result set which has the break-down described above.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
对于这个问题,您将遇到的问题是,随着数据集的增长,使用 TSQL 解决它的解决方案将无法很好地扩展。下面使用一系列动态构建的临时表来解决该问题。它使用数字表将每个日期范围条目拆分为各自的日期。这是它无法扩展的地方,主要是由于开放范围的 NULL 值看起来无穷大,因此您必须将固定日期交换到遥远的未来,从而将转换范围限制为可行的时间长度。通过构建具有适当索引的日期表或日历表来优化每天的渲染,您可能会看到更好的性能。
拆分范围后,将使用 XML PATH 合并描述,以便范围系列中的每一天都包含为其列出的所有描述。按 PersonID 和日期进行行编号允许使用两个 NOT EXISTS 检查来查找每个范围的第一行和最后一行,以查找匹配的 PersonID 和描述集的前一行不存在的实例,或者下一行不存在的实例不存在匹配的 PersonID 和 Description 集。
然后使用 ROW_NUMBER 对该结果集重新编号,以便将它们配对以构建最终结果。
上述解决方案还将处理其他描述之间的间隙,因此,如果您要为 PersonID 18 添加另一个描述并留下间隙:
它将适当地填充间隙。正如评论中所指出的,您不应该在此表中包含姓名信息,它应该标准化为可以在最终结果中联接的人员表。我通过使用 SELECT DISTINCT 构建临时表来创建该 JOIN 来模拟另一个表。
The problem you are going to have with this problem is that as the data set grows, the solutions to solve it with TSQL won't scale well. The below uses a series of temporary tables built on the fly to solve the problem. It splits each date range entry into its respective days using a numbers table. This is where it won't scale, primarily due to your open ranged NULL values which appear to be inifinity, so you have to swap in a fixed date far into the future that limits the range of conversion to a feasible length of time. You could likely see better performance by building a table of days or a calendar table with appropriate indexing for optimized rendering of each day.
Once the ranges are split, the descriptions are merged using XML PATH so that each day in the range series has all of the descriptions listed for it. Row Numbering by PersonID and Date allows for the first and last row of each range to be found using two NOT EXISTS checks to find instances where a previous row doesn't exist for a matching PersonID and Description set, or where the next row doesn't exist for a matching PersonID and Description set.
This result set is then renumbered using ROW_NUMBER so that they can be paired up to build the final results.
The above solution will also handle gaps between additional Descriptions as well, so if you were to add another Description for PersonID 18 leaving a gap:
It will fill the gap appropriately. As pointed out in the comments, you shouldn't have name information in this table, it should be normalized out to a Persons Table that can be JOIN'd to in the final result. I simulated this other table by using a SELECT DISTINCT to build a temp table to create that JOIN.
试试这个
Try this
[只有很多很多年之后。]
我创建了一个存储过程,它将按单个表中的分区对齐和打断段,然后您可以使用这些对齐的打断点,使用子查询和 XML PATH 将描述旋转到不规则的列中。
查看以下内容是否有帮助:
文档:https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.md
存储过程:https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.sql
例如,您的调用可能如下所示:
您将希望捕获结果(这是一个表)到另一个表或临时表(假设在下面的示例中称为“AlignedDataTable”)。然后,您可以使用子查询进行透视。
[Only many, many years later.]
I created a stored procedure that will align and break segments by a partition within a single table, and then you can use those aligned breaks to pivot the description into a ragged column using a subquery and XML PATH.
See if the below help:
Documentation: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.md
Stored Procedure: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.sql
For example, your call might look like:
You will want to capture the result (which is a table) into another table or temporary table (assuming it is called "AlignedDataTable" in below example). Then, you can pivot using a subquery.