SQL Server更改数据捕获 - 验证增量窗口
我想使用SQL Server更改数据捕获来实现增量加载过程。我发现的每个例子都采用“快乐的道路”。
换句话说,他们认为CDC历史记录超过了上一次成功的增量负载以来的时间。
假设我们将清理工作留下3天的默认工作,出于某种原因,我们的负载未成功完成的时间比这更长。我需要检查一下并运行完整的提取物。
我正在在SQL Server表中记录成功的执行日期。因此,如果我将最后一个成功的日期与cdc.lsn_time_mapping
表中的最早记录进行比较,这是否可以完成我的任务?
基本上是类似的:
Select @LastSuccessfulDate from audit....
Select @MinCdCDate = min(tran_begin_time) from cdc.lsn_time_mapping
if @MinCdCDate > @LastSuccessfulDate then 'Full' else 'Incremental'
这个工作应该吗?有更好的方法可以完成吗?
I want to implement an incremental load process using SQL Server Change Data Capture. Every example I find takes the "happy path."
In other words, they assume that the CDC history exceeds the time since the last successful incremental load.
Suppose we leave the cleanup job with the default of 3 days, and for some reason our load hasn't successfully completed for longer than that. I need to check for this and run a full extract instead.
I'm logging the successful execution datetime in SQL Server tables. So, if I compare the last successful date to the earliest record in the cdc.lsn_time_mapping
table, will this accomplish my task?
Basically something like:
Select @LastSuccessfulDate from audit....
Select @MinCdCDate = min(tran_begin_time) from cdc.lsn_time_mapping
if @MinCdCDate > @LastSuccessfulDate then 'Full' else 'Incremental'
Should this work? Is there a better way to accomplish it?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
直接与CDC合作时,我总是会留在“日志域”中。因此,跟踪最后运行的最后一个LSN,并将其与 sys.fn_cdc_get_min_lsn 每次您同步时。
因此,如果您上次在LSN = 100同步,而MIN_LSN = 110,则您的差距为10个丢失的日志记录。
但这只是许多场景之一,您需要您重新对复制进行完整同步重复,因此您还应该有一个输入参数或某种内容来强制完整同步。
I would always stay in the "log domain" not the "time domain" when working directly with CDC. So track the last LSN of the last run and compare it against sys.fn_cdc_get_min_lsn every time you syncronize.
So if you last synchronized at lsn=100, and the min_lsn=110, then you've got a gap of 10 missing log records.
But this is only one of many scenarios that will require you to reinitialize the replication with a full sync, so you should also have an input parameter or somesuch to force a full sync.