Turning foreign key constraints on and off
We have a lengthy ETL process that flows data from input files through a series of tables.
I am considering adding referential integrity to our tables (unlikely as that is to happen), but I do not want our ETL process to die when it encounters a violation. I also want the records that violate referential integrity to still be loaded. In the end, however, I want to be aware of every violation.
Method 1:
I could leave referential integrity off and write a SQL procedure that runs a set of stored procedures to identify records that violate it. But I really like the idea of the referential integrity being defined on the tables themselves, because I feel that this documents the database in the best place: the database.
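As a rough sketch of what one such Method 1 check could look like: the query below lists child rows from a single load whose foreign key value has no matching parent. The tables dbo.Parent and dbo.Child, their columns, and the LoadId batch marker are all hypothetical names invented for illustration; nothing in the question specifies them.

```sql
-- Hypothetical schema (not from the question):
--   dbo.Parent (ParentId)                    -- referenced table
--   dbo.Child  (ChildId, ParentId, LoadId)   -- referencing table
-- LoadId is an assumed column that the ETL run stamps on every row it inserts.
DECLARE @LoadId int = 42;   -- the load to inspect (illustrative value)

SELECT c.ChildId, c.ParentId
FROM dbo.Child AS c
WHERE c.LoadId = @LoadId              -- only rows added by this run
  AND c.ParentId IS NOT NULL          -- a NULL key does not violate a foreign key
  AND NOT EXISTS (SELECT 1
                  FROM dbo.Parent AS p
                  WHERE p.ParentId = c.ParentId);
```

Restricting the check to one load is what keeps it cheap on an ever-growing table; the trade-off is that one such query has to be maintained by hand for every relationship.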
Method 2:
Instead of writing a set of custom queries to identify violators, I am thinking that we should drop all referential integrity at the beginning of our process and add it back at the end. Where we get exceptions, we know there are violations. I kind of like this approach, but unlike Method 1, where the SQL can be written to target only the records just added, adding referential integrity back will probably re-examine the whole table, a table that is ever growing. On the other hand, once referential integrity is turned back on, the consumer of the data can be assured that the data is "good" without performing any more on-the-fly queries. I like that...
Is there a third approach?
I see T-SQL supports commands like
NOCHECK CONSTRAINT
ON UPDATE NO ACTION
ON DELETE NO ACTION
but I am not sure how they are really intended to be used. For example,
ALTER TABLE dbo.TableName
NOCHECK CONSTRAINT FK01
Is the intent of this to turn off constraint checking when you have a reliable source? I assume that if a constraint is turned off and then back on in this manner, the setting change only applies to future operations.
What approach would you use to let a process run to completion and still identify all referential integrity violations, actual or would-be?
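For reference, a minimal sketch of how these pieces are commonly combined in SQL Server, reusing the dbo.TableName and FK01 names from the example above. The ordering of the steps is an assumption about how an ETL run might use them, not a prescribed procedure.

```sql
-- 1. Disable the foreign key before the load. The definition stays on the table,
--    but inserts and updates are no longer checked against it.
ALTER TABLE dbo.TableName NOCHECK CONSTRAINT FK01;

-- ... the ETL load runs here ...

-- 2. Report rows that violate constraints on the table, including disabled ones.
--    This returns a result set of offending values; it does not raise an error per row.
DBCC CHECKCONSTRAINTS ('dbo.TableName') WITH ALL_CONSTRAINTS;

-- 3a. Re-enable for future operations only. Existing rows are NOT re-validated,
--     and the constraint remains marked as not trusted.
ALTER TABLE dbo.TableName CHECK CONSTRAINT FK01;

-- 3b. Alternative: re-enable AND validate existing rows. This statement fails
--     if any violating row is still present.
-- ALTER TABLE dbo.TableName WITH CHECK CHECK CONSTRAINT FK01;

-- 4. Inspect the state of the foreign keys defined on the table.
SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE parent_object_id = OBJECT_ID('dbo.TableName');
```

So the assumption in the question holds for a plain CHECK CONSTRAINT (step 3a): only future operations are checked. Re-validating what is already in the table requires the WITH CHECK form in step 3b.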
Comments (2)
I would personally never turn off FK constraints. It's the start of a slippery decline into hell. They are there for a reason.
I would split your ETL into batches of N rows and wrap each batch in a transaction. If a transaction fails due to an FK violation, log it and do whatever your recovery requirement is. Never leave bad data in.
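A minimal sketch of that batching idea in T-SQL, under stated assumptions: dbo.Staging (with an ascending RowId), dbo.Child, and the dbo.EtlRejects log table are hypothetical names, and the batch size is arbitrary. It logs every failed batch together with its error number instead of stopping the run.

```sql
-- Hypothetical objects (not from the thread):
--   dbo.Staging    (RowId, ChildId, ParentId)  -- source rows, RowId ascending
--   dbo.Child      (ChildId, ParentId)         -- target table carrying the foreign key
--   dbo.EtlRejects (FromRowId, ToRowId, ErrorNumber, ErrorMessage)  -- failure log
DECLARE @BatchSize int = 1000,
        @From      int = 1,
        @MaxId     int;

SELECT @MaxId = MAX(RowId) FROM dbo.Staging;

WHILE @From <= @MaxId
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        INSERT INTO dbo.Child (ChildId, ParentId)
        SELECT ChildId, ParentId
        FROM dbo.Staging
        WHERE RowId >= @From
          AND RowId <  @From + @BatchSize;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Undo the whole batch so no partial or invalid data is left behind.
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;

        -- Record the failed range; error 547 is the constraint-conflict error
        -- raised for foreign key violations, so the log can be filtered on it later.
        INSERT INTO dbo.EtlRejects (FromRowId, ToRowId, ErrorNumber, ErrorMessage)
        VALUES (@From, @From + @BatchSize - 1, ERROR_NUMBER(), ERROR_MESSAGE());
    END CATCH;

    SET @From += @BatchSize;   -- move on to the next batch either way
END;
```

One violating row rejects its entire batch in this sketch. A possible refinement is to replay a failed range with a smaller batch size to isolate the offending rows.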
1. I find the name NO ACTION a bit misleading, because it means the DML statement will fail if it violates the constraint. Some RDBMSs, MySQL in particular, have a better keyword, RESTRICT, which is more descriptive.
2. You can temporarily disable/enable all constraints on a table with
ALTER TABLE ... NOCHECK/CHECK CONSTRAINT ALL
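To make both points concrete, here is a small sketch using two throwaway tables, dbo.Parent and dbo.Child. These names are hypothetical and chosen only for the illustration.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.Parent (
    ParentId int NOT NULL PRIMARY KEY
);

CREATE TABLE dbo.Child (
    ChildId  int NOT NULL PRIMARY KEY,
    ParentId int NULL,
    CONSTRAINT FK_Child_Parent FOREIGN KEY (ParentId)
        REFERENCES dbo.Parent (ParentId)
        ON DELETE NO ACTION   -- default behaviour: the DELETE is rejected, nothing cascades
        ON UPDATE NO ACTION   -- likewise for updates to the referenced key
);

INSERT INTO dbo.Parent (ParentId) VALUES (1);
INSERT INTO dbo.Child (ChildId, ParentId) VALUES (10, 1);

-- Point 1: NO ACTION does not mean "ignore the violation". This statement fails
-- with error 547 because a child row still references ParentId = 1.
DELETE FROM dbo.Parent WHERE ParentId = 1;

-- Point 2: disable every FOREIGN KEY and CHECK constraint on the table
-- (PRIMARY KEY, UNIQUE and DEFAULT constraints are not affected) ...
ALTER TABLE dbo.Child NOCHECK CONSTRAINT ALL;

-- ... and turn them back on, validating the rows already in the table.
ALTER TABLE dbo.Child WITH CHECK CHECK CONSTRAINT ALL;
```

Omitting WITH CHECK on the last statement re-enables the constraints for future operations only and leaves them marked as not trusted.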