SQL Server .NET SqlBulkCopy class: keep running after an exception
I'm inserting a lot of data with SqlBulkCopy. The source data may contain some duplicated rows, and the destination table has a unique index.
When the first duplicated row appears, SqlBulkCopy throws an exception and rolls back the internal transaction. I want it to ignore the exception and keep inserting (ONLY if the exception is for a duplicated row).
There are some questions about this issue, but they are all looking to report the duplicated rows; I just don't care about them.
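For reference, a minimal sketch of the failing setup (the table name, column names and unique index below are placeholders for illustration, not the real schema):

```csharp
using System.Data;
using Microsoft.Data.SqlClient; // System.Data.SqlClient behaves the same way

static void Load(DataTable source, string connectionString)
{
    using var bulk = new SqlBulkCopy(connectionString)
    {
        DestinationTableName = "dbo.TargetTable" // table with the unique index
    };

    // The first duplicate key in `source` throws a SqlException and the whole
    // batch is rolled back, so nothing ends up in the table.
    bulk.WriteToServer(source);
}
```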
2 Answers
Fix the issue before doing the bulk insert. Bulk insert is specifically defined not to handle stuff like this. To achieve its speed it puts SQL Server into a bulk-insert mode, which is faster... but has limitations. All key violations must be handled BEFORE the insert.
What you may want to do is insert into a staging table (duplicates allowed), then MERGE into the main table (using the MERGE command).
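A rough sketch of that pattern, assuming a staging table dbo.Staging with the same columns as the destination dbo.TargetTable and a single key column Id (all names here are placeholders):

```csharp
using System.Data;
using Microsoft.Data.SqlClient;

static void LoadViaStaging(DataTable rows, string connectionString)
{
    // Keep one candidate row per key; rows whose key already exists in the
    // destination are simply skipped by the MERGE.
    const string mergeSql = @"
        MERGE dbo.TargetTable AS target
        USING (
            SELECT Id, Payload
            FROM (SELECT Id, Payload,
                         ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) AS rn
                  FROM dbo.Staging) AS s
            WHERE rn = 1
        ) AS source
        ON target.Id = source.Id
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (Id, Payload) VALUES (source.Id, source.Payload);
        TRUNCATE TABLE dbo.Staging;";

    using var connection = new SqlConnection(connectionString);
    connection.Open();

    // Fast path: the staging table has no unique index, so duplicates load fine.
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "dbo.Staging";
        bulk.WriteToServer(rows);
    }

    // De-duplicate while moving the data into the real table, then clear staging.
    using var merge = new SqlCommand(mergeSql, connection);
    merge.ExecuteNonQuery();
}
```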
This is one of the trade-offs with SqlBulkCopy - I use it when I know I have clean data. If you need to deal with errors nicely in situations like this where you don't have clean data, then other approaches are "better" (though that's a trade-off against performance).
Following on from your other question, the SqlDataAdapter approach would give you this ability, as there is a ContinueUpdateOnError property you can set that will allow the process to continue in the event of errors like this - very handy. I use this approach when I need to handle not-so-clean/potentially problematic data from .NET. But as I linked to previously, you will see a performance cost.
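A minimal sketch of that approach (same placeholder table and column names as above); rows that violate the unique index get their RowError set and the Update call carries on:

```csharp
using System.Data;
using Microsoft.Data.SqlClient;

static void InsertIgnoringDuplicates(DataTable rows, string connectionString)
{
    using var connection = new SqlConnection(connectionString);
    connection.Open();

    var insert = new SqlCommand(
        "INSERT INTO dbo.TargetTable (Id, Payload) VALUES (@Id, @Payload)",
        connection);
    insert.Parameters.Add("@Id", SqlDbType.Int, 0, "Id");
    insert.Parameters.Add("@Payload", SqlDbType.NVarChar, 4000, "Payload");

    using var adapter = new SqlDataAdapter { InsertCommand = insert };

    // Unique-key violations mark the offending row (RowError/HasErrors)
    // instead of aborting the whole Update call.
    adapter.ContinueUpdateOnError = true;

    // Rows must be in the Added state (e.g. just added via DataTable.Rows.Add).
    // This inserts row by row, so it is slower than SqlBulkCopy.
    adapter.Update(rows);
}
```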
If you want absolute performance and still want to meet your original requirements, then you should dedupe the data before loading it into the DB, or use an approach like the one TomTom suggested.
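And a sketch of the dedupe-first option, keeping the first row per key (Id is an assumed key column) before handing the cleaned table to SqlBulkCopy as usual:

```csharp
using System.Data;
using System.Linq;

// On .NET Framework, AsEnumerable()/Field<T>() need a reference to
// System.Data.DataSetExtensions.
static DataTable DedupeById(DataTable source)
{
    DataTable clean = source.Clone(); // same schema, no rows

    foreach (var group in source.AsEnumerable()
                                .GroupBy(row => row.Field<int>("Id")))
    {
        clean.ImportRow(group.First()); // keep one row per key
    }

    return clean;
}
```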