Inserting rows with SqlBulkCopy
I am switching some of my LINQ to SQL code to use SqlBulkCopy, and the problem is that I need to do two inserts, each of several thousand rows, into two tables.
The service takes your batch of 10,000 links (imported from a sitemap, backlink builders, etc.) and chops them into RSS feeds of X links per feed for aggregation. The problem is that I already have a table of 32 million rows. With LINQ to SQL inserts, loading 10,000 links takes anywhere from 5 to 10 minutes, depending on site traffic.
The structure is very basic.
Feeds: Id bigint (PK), Title varchar(1000), Description varchar(1000), Published datetime, Aggregated datetime null, ShortCode varchar(8) [antiquated, not inserted anymore, but used for legacy data]
Items: Id bigint (PK), FeedId bigint (FK), Title varchar(1000), Description varchar(1000), Published datetime, ShortCode varchar(8) [antiquated, not inserted anymore, but used for legacy data], ShortId bigint null [updated after insert to equal Id (used in partitioning)]
FutureItems: Id bigint (PK), FeedId bigint (FK), Title varchar(1000), Description varchar(1000), Published datetime, ShortCode varchar(8) [antiquated, not inserted anymore, but used for legacy data], ShortId bigint null [updated after insert to equal Id (used in partitioning)]
OldItems: Id bigint (PK), FeedId bigint (FK), Title varchar(1000), Description varchar(1000), Published datetime, ShortCode varchar(8) [antiquated, not inserted anymore, but used for legacy data], ShortId bigint null [updated after insert to equal Id (used in partitioning)]
So with a feed size of 20, you get 500 inserts into the Feeds table, then 10,000 inserts into the Items table, and then an update runs to set the ShortId equal to the Id. Once a night, a job runs that separates the data into the other two tables and shifts future items into the Items table.
I read that SqlBulkCopy can do 20 million rows in a matter of minutes, but I can't find any good examples of doing it into multiple tables with an FK constraint.
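For reference, the plain single-table case looks roughly like the sketch below (illustrative only, not my production code; it assumes a pre-filled DataTable and that Feeds.Id is an identity column generated by SQL Server). What I can't figure out is how to carry the newly generated feed IDs over into the Items insert across the FK.

```csharp
using System.Data;
using System.Data.SqlClient;

static class FeedLoader
{
    // Minimal single-table sketch: bulk copy a pre-filled DataTable into Feeds.
    // Column names follow the schema above; Id is assumed to be IDENTITY and is
    // therefore not mapped (SQL Server generates it).
    public static void BulkCopyFeeds(DataTable feeds, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var bulk = new SqlBulkCopy(conn))
        {
            conn.Open();
            bulk.DestinationTableName = "dbo.Feeds";
            bulk.BatchSize = 1000;      // commit in chunks rather than one huge batch
            bulk.BulkCopyTimeout = 0;   // no timeout on the copy itself

            // Map by name so the DataTable column order doesn't matter.
            bulk.ColumnMappings.Add("Title", "Title");
            bulk.ColumnMappings.Add("Description", "Description");
            bulk.ColumnMappings.Add("Published", "Published");

            bulk.WriteToServer(feeds);
        }
    }
}
```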
Our SQL Server box is a "monster", especially for this application: SQL Server 2008 R2 Web on Windows Server 2008 R2 Enterprise, 12 GB RAM, dual quad-core Xeons @ 2.8 GHz.
Our web server is a clone without the database service.
The CPU runs at about 85% while inserting links, and the database fills the RAM.
If SqlBulkCopy isn't the right approach, any suggestion is welcome. We have paying customers who are getting mad, and I am not a DBA, just a plain old programmer.
2 Answers
SqlBulkCopy is indeed faster than ordinary inserts, but faster in the sense that it can turn a job doing 1,000 inserts per second into one doing 10,000 per second. If you can only load 10,000 links in 10 minutes, you must have a different problem, one that bulk copy is unlikely to solve.
You first need to investigate why it takes so incredibly long to insert 10,000 links. Only after you understand that can you decide whether moving to SqlBulkCopy is the solution. I understand that you are not a DBA, but I'm going to point you to a 'DBA-ish' white paper for troubleshooting SQL Server performance: Waits and Queues. It is not a cookie-cutter recipe; it is a methodology that teaches you how to identify performance bottlenecks in SQL Server.
And to address your question: how does one use SqlBulkCopy when there are constraints? The more general question is how to do bulk insert operations when constraints are in place. For serious volumes, you actually disable the constraints, perform the bulk upload, and then re-enable the constraints. For more streamlined online operations with minimal downtime (the database is basically 'down' for the period when the constraints are disabled), a different strategy is used: pre-load the data into staging tables, validate it, and then switch it in with a partition switch operation; see Transferring Data Efficiently by Using Partition Switching.
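To make the first route concrete, a rough sketch (the constraint name FK_Items_Feeds and the pre-filled DataTable are assumptions, not taken from your schema): disable the foreign key, run the bulk copy, then re-enable it WITH CHECK so SQL Server re-validates the rows and the constraint remains trusted.

```csharp
using System.Data;
using System.Data.SqlClient;

static class ItemLoader
{
    // Sketch only: bulk load Items with its FK to Feeds disabled, then re-enable it.
    public static void BulkLoadItems(DataTable items, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Disable the FK so the load is not checked row by row.
            using (var cmd = new SqlCommand(
                "ALTER TABLE dbo.Items NOCHECK CONSTRAINT FK_Items_Feeds;", conn))
                cmd.ExecuteNonQuery();

            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.Items";
                bulk.BatchSize = 5000;
                bulk.BulkCopyTimeout = 0;
                bulk.WriteToServer(items);
            }

            // WITH CHECK re-validates the data so the constraint is trusted again.
            using (var cmd = new SqlCommand(
                "ALTER TABLE dbo.Items WITH CHECK CHECK CONSTRAINT FK_Items_Feeds;", conn))
                cmd.ExecuteNonQuery();
        }
    }
}
```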
I think your real problem with just using a plain bulk insert is that you need the feed IDs from the initial insert for the other tables. Here's what I would do: use bulk insert to load a staging table, then use a stored proc to do the inserts into the real tables in a set-based fashion. You can use the OUTPUT clause on the initial insert into the Feeds table to get back a table variable with the feed IDs you need for the inserts into the other tables.
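Something along these lines (a sketch under assumed names: FeedStaging, ItemStaging and the BatchFeedNo correlation column are made up, and the T-SQL batch would normally live in the stored proc; MERGE is used here because a plain INSERT ... OUTPUT cannot return source columns next to the new identity values, while MERGE's OUTPUT can):

```csharp
using System.Data;
using System.Data.SqlClient;

static class BatchLoader
{
    // Sketch: bulk copy into staging tables, then one set-based batch that inserts
    // the Feeds rows, captures the new identity values via OUTPUT, and uses them
    // for the Items insert. Staging tables and BatchFeedNo are hypothetical.
    public static void LoadBatch(DataTable stagedFeeds, DataTable stagedItems, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // 1. Raw rows go into staging first (columns assumed to line up with the staging tables).
            using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.FeedStaging" })
                bulk.WriteToServer(stagedFeeds);
            using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.ItemStaging" })
                bulk.WriteToServer(stagedItems);

            // 2. Set-based insert: MERGE lets the OUTPUT clause pair each staging
            //    row's BatchFeedNo with the identity value SQL Server just generated.
            const string sql = @"
DECLARE @NewFeeds TABLE (BatchFeedNo int, FeedId bigint);

MERGE dbo.Feeds AS target
USING dbo.FeedStaging AS src ON 1 = 0              -- never matches: always insert
WHEN NOT MATCHED THEN
    INSERT (Title, Description, Published)
    VALUES (src.Title, src.Description, src.Published)
OUTPUT src.BatchFeedNo, inserted.Id INTO @NewFeeds (BatchFeedNo, FeedId);

INSERT dbo.Items (FeedId, Title, Description, Published)
SELECT nf.FeedId, i.Title, i.Description, i.Published
FROM dbo.ItemStaging AS i
JOIN @NewFeeds AS nf ON nf.BatchFeedNo = i.BatchFeedNo;

-- ShortId = Id, as described in the question.
UPDATE i SET ShortId = i.Id
FROM dbo.Items AS i
JOIN @NewFeeds AS nf ON nf.FeedId = i.FeedId
WHERE i.ShortId IS NULL;

TRUNCATE TABLE dbo.FeedStaging;
TRUNCATE TABLE dbo.ItemStaging;";

            using (var cmd = new SqlCommand(sql, conn) { CommandTimeout = 300 })
                cmd.ExecuteNonQuery();
        }
    }
}
```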