使用 SQL 查找冲突的日期间隔
假设我在 Sql Server 2008 中有下表:
ItemId StartDate EndDate
1 NULL 2011-01-15
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
如您所见,该表具有 StartDate 和 EndDate 列。我想验证这些列中的数据。间隔不能互相冲突。因此,上表有效,但下一个表无效,因为第一行的结束日期大于第二行的开始日期。
ItemId StartDate EndDate
1 NULL 2011-01-17
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
NULL
在这里表示无穷大。
你能帮我写一个数据验证脚本吗?
[第二个任务]
感谢您的回答。 我有一个并发症。假设,我有这样的表:
ItemId IntervalId StartDate EndDate 1 1 空 2011-01-15 2 1 2011-01-16 2011-01-25 3 1 2011-01-26 空 4 2 空 2011-01-17 5 2 2011-01-16 2011-01-25 6 2 2011-01-26 空
在这里,我想验证一组 IntervalId
内的间隔,但不是整个表内的间隔。因此,间隔 1 有效,但间隔 2 无效。
还有。是否可以向表添加约束以避免此类无效记录?
[最终解决方案]
我创建了函数来检查间隔是否冲突:
CREATE FUNCTION [dbo].[fnIntervalConflict]
(
@intervalId INT,
@originalItemId INT,
@startDate DATETIME,
@endDate DATETIME
)
RETURNS BIT
AS
BEGIN
SET @startDate = ISNULL(@startDate,'1/1/1753 12:00:00 AM')
SET @endDate = ISNULL(@endDate,'12/31/9999 11:59:59 PM')
DECLARE @conflict BIT = 0
SELECT TOP 1 @conflict = 1
FROM Items
WHERE IntervalId = @intervalId
AND ItemId <> @originalItemId
AND (
(ISNULL(StartDate,'1/1/1753 12:00:00 AM') >= @startDate
AND ISNULL(StartDate,'1/1/1753 12:00:00 AM') <= @endDate)
OR (ISNULL(EndDate,'12/31/9999 11:59:59 PM') >= @startDate
AND ISNULL(EndDate,'12/31/9999 11:59:59 PM') <= @endDate)
)
RETURN @conflict
END
然后我向表中添加了 2 个约束:
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_Dates CHECK (StartDate IS NULL OR EndDate IS NULL OR StartDate <= EndDate)
GO
我
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_ValidInterval CHECK (([dbo].[fnIntervalConflict]([IntervalId], ItemId,[StartDate],[EndDate])=(0)))
GO
知道,第二个约束会减慢插入和更新操作,但这并不是很重要对于我的申请。 而且,现在我可以在插入和更新表中的数据之前从应用程序代码中调用函数 fnIntervalConflict
。
Suppose I have following table in Sql Server 2008:
ItemId StartDate EndDate
1 NULL 2011-01-15
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
As you can see, this table has StartDate and EndDate columns. I want to validate data in these columns. Intervals cannot conflict with each other. So, the table above is valid, but the next table is invalid, becase first row has End Date greater than StartDate in the second row.
ItemId StartDate EndDate
1 NULL 2011-01-17
2 2011-01-16 2011-01-25
3 2011-01-26 NULL
NULL
means infinity here.
Could you help me to write a script for data validation?
[The second task]
Thanks for the answers.
I have a complication. Let's assume, I have such table:
ItemId IntervalId StartDate EndDate 1 1 NULL 2011-01-15 2 1 2011-01-16 2011-01-25 3 1 2011-01-26 NULL 4 2 NULL 2011-01-17 5 2 2011-01-16 2011-01-25 6 2 2011-01-26 NULL
Here I want to validate intervals within a groups of IntervalId
, but not within the whole table. So, Interval 1 will be valid, but Interval 2 will be invalid.
And also. Is it possible to add a constraint to the table in order to avoid such invalid records?
[Final Solution]
I created function to check if interval is conflicted:
CREATE FUNCTION [dbo].[fnIntervalConflict]
(
@intervalId INT,
@originalItemId INT,
@startDate DATETIME,
@endDate DATETIME
)
RETURNS BIT
AS
BEGIN
SET @startDate = ISNULL(@startDate,'1/1/1753 12:00:00 AM')
SET @endDate = ISNULL(@endDate,'12/31/9999 11:59:59 PM')
DECLARE @conflict BIT = 0
SELECT TOP 1 @conflict = 1
FROM Items
WHERE IntervalId = @intervalId
AND ItemId <> @originalItemId
AND (
(ISNULL(StartDate,'1/1/1753 12:00:00 AM') >= @startDate
AND ISNULL(StartDate,'1/1/1753 12:00:00 AM') <= @endDate)
OR (ISNULL(EndDate,'12/31/9999 11:59:59 PM') >= @startDate
AND ISNULL(EndDate,'12/31/9999 11:59:59 PM') <= @endDate)
)
RETURN @conflict
END
And then I added 2 constraints to my table:
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_Dates CHECK (StartDate IS NULL OR EndDate IS NULL OR StartDate <= EndDate)
GO
and
ALTER TABLE dbo.Items ADD CONSTRAINT
CK_Items_ValidInterval CHECK (([dbo].[fnIntervalConflict]([IntervalId], ItemId,[StartDate],[EndDate])=(0)))
GO
I know, the second constraint slows insert and update operations, but it is not very important for my application.
And also, now I can call function fnIntervalConflict
from my application code before inserts and updates of data in the table.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
像这样的东西应该给你所有重叠的时期
编辑为阿德里安评论如下
Something like this should give you all overlaping periods
Edited for Adrians comment bellow
这将为您提供不正确的行。
添加了
ROW_NUMBER()
因为我不知道所有条目是否按顺序排列。编辑:
至于更新的问题:
只需将
PARTITION BY
子句添加到ROW_NUMBER()
查询并更改连接即可。This will give you the rows that are incorrect.
Added
ROW_NUMBER()
as I didnt know if all entries where in order.EDIT:
As for the updated question:
Just add a
PARTITION BY
clause to theROW_NUMBER()
query and alter the join.结果:
Result:
与 OP 没有直接关系,但自从 Adrian 表达了兴趣以来。这是 SQL Server 维护的表的完整性,确保任何时候都只存在一个有效值。在这种情况下,我正在处理当前/历史表,但是可以修改示例以处理未来的数据(尽管在这种情况下,您不能拥有索引视图,并且需要直接编写合并的,而不是通过触发器维护)。
在这种特殊情况下,我正在处理一个要跟踪其历史记录的链接表。首先,我们要链接的表:
现在,如果我们要构建一个普通表,我们将有以下内容(不要运行这个):
但是,我们想要一个可以代表了一段完整的历史。特别是,我们希望设计结构,使得数据库中永远不会出现重叠的时间段。我们总是知道哪条记录在任何特定时间都是有效的:
这有很多限制。维护此表的唯一方法是通过合并语句(请参见下面的示例,并尝试自己推理原因)。我们现在将构建一个模仿上面定义的
ClientAnswers
表的视图:并且我们拥有我们最初想要的 PK 约束。我们还使用
ISNULL
恢复IntValue
列的not null
状态(尽管检查约束已经保证了这一点,SQL Server无法得出此信息)。如果我们使用 ORM,我们让它以ClientAnswers
为目标,并且历史记录会自动构建。接下来,我们可以有一个让我们回顾过去的函数:最后,我们需要
ClientAnswers
上的触发器来构建此历史记录。我们需要使用合并语句,因为我们需要同时插入新行,并更新前一个“有效”行以使用新的 ValidTo 值结束日期。显然,我可以构建一个更简单的表(不是连接表),但这是我的标准首选示例(尽管我花了一段时间来重建它 - 我忘记了
set nocount on
语句一阵子)。但这里的优势在于,基表ClientAnswerHistories
无法存储相同ClientID
和DataItemID< 的重叠时间范围/代码>值。
当您需要处理临时外键时,事情会变得更加复杂。
当然,如果您不想要任何真正的间隙,那么您可以删除
Deleted
列(以及相关的检查),使not null
列真正not null
,修改insert
触发器以执行普通插入,并让delete
触发器引发错误。Not directly related to the OP, but since Adrian's expressed an interest. Here's a table than SQL Server maintains the integrity of, ensuring that only one valid value is present at any time. In this case, I'm dealing with a current/history table, but the example can be modified to work with future data also (although in that case, you can't have the indexed view, and you need to write the merge's directly, rather than maintaining through triggers).
In this particular case, I'm dealing with a link table that I want to track the history of. First, the tables that we're linking:
Now, if we were building a normal table, we'd have the following (Don't run this one):
But, we want a table that can represent a complete history. In particular, we want to design the structure such that overlapping time periods can never appear in the database. We always know which record was valid at any particular time:
That's a lot of constraints. The only way to maintain this table is through merge statements (see examples below, and try to reason about why yourself). We're now going to build a view that mimics that
ClientAnswers
table defined above:And we have the PK constraint we originally wanted. We've also used
ISNULL
to reinstate thenot null
-ness of theIntValue
column (even though the check constraints already guarantee this, SQL Server is unable to derive this information). If we're working with an ORM, we let it targetClientAnswers
, and the history gets automatically built. Next, we can have a function that lets us look back in time:And finally, we need the triggers on
ClientAnswers
that build this history. We need to use merge statements, since we need to simultaneously insert new rows, and update the previous "valid" row to end date it with a new ValidTo value.Obviously, I could have built a simpler table (not a join table), but this is my standard go-to example (albeit it took me a while to reconstruct it - I forgot the
set nocount on
statements for a while). But the strength here is that, the base table,ClientAnswerHistories
is incapable of storing overlapping time ranges for the sameClientID
andDataItemID
values.Things get more complex when you need to deal with temporal foreign keys.
Of course, if you don't want any real gaps, then you can remove the
Deleted
column (and associated checks), make thenot null
columns reallynot null
, modify theinsert
trigger to do a plain insert, and make thedelete
trigger raise an error instead.如果我的数据永远不会有重叠的间隔,我总是采取稍微不同的设计方法......即不存储间隔,而只存储开始时间。然后,获得有助于显示间隔的视图。
因此,示例数据可能如下所示:
并且一个简单的 SELECT * FROMIntervals 会产生:
I've always taken a slightly different approach to the design if I have data that is never to have overlapping intervals... namely don't store intervals, but only start times. Then, have a view that helps with displaying the intervals.
So, sample data might look like:
and a simple
SELECT * FROM intervals
yields: