SQL bulk insert with parent/child relationship: is order preserved?
Similar to these other questions noted below, I have two tables with the structure:
create table parent (
recno int identity(1,1) primary key not null,
groupCode int,
parentdata varchar(80)
);
create table child (
parentrecno int not null,
childdata varchar(80)
);
I need to insert a few hundred thousand records quickly into these tables -- and the tables hold millions of other records unrelated to this insert and are never quiet. Because of the parent/child nature, it's not a good candidate (it seems) for SqlBulkCopy.
In C# using SqlCommand with INSERT I'm getting about 400-500 records/second inserted, and this is a bit too slow. Pseudocode:
foreach(Record r in parentRecords)
{
Insert Fields from r into SqlCommand Parameters but not "recno"
Call ExecuteScalar to insert and fetch the inserted identity value (recno)
foreach(ChildRecord cr in r.Children)
{
Insert Fields from cr into SqlCommand Parameters
Insert the identity value (recno) from above into Parameters
(as parentrecno)
Call ExecuteNonQuery to insert the record
}
}
After reading those other posts, a thought occurred to me. The groupCode attached to the parent records is unique to that set of parent records I'm inserting. Would it work to:
- Bulk insert the parent records with SqlBulkCopy, letting the insert auto-generate the recno identity field as usual.
- Perform a SELECT on just the inserted records: select recno from parent where groupCode = @thisgroup order by recno;
- Use the retrieved values to fill in the parentrecno fields for the child records in memory.
- Bulk insert the child records with SqlBulkCopy.
This would rely on the parent records going into the SQL table in the same order as they are in the original DataTable (and the identity values being assigned in that same order). Is this something I can rely on?
Related questions:
How to update Dataset Parent & Child tables with Autogenerated Identity Key?
SqlBulkCopy and DataTables with Parent/Child Relation on Identity Column
2 Answers
Create two staging tables with the same structure as your target tables, but don't use identity on the recno column.
Bulk load your data into the staging tables, keeping the recno/parentrecno values as is.
Then you can use MERGE with an OUTPUT clause to move the data from the staging tables.
This will only work on SQL Server 2008 and later, since MERGE was introduced in 2008.
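A minimal sketch of that approach, assuming hypothetical staging tables named parent_stage and child_stage (same shape as the targets, but with recno as a plain int) that have already been bulk loaded with client-assigned keys. The trick is that MERGE's OUTPUT clause, unlike INSERT's, may reference source columns, so a never-matching ON predicate turns the MERGE into an insert that can also emit the old-key-to-new-identity mapping:

-- Map each staged key to the identity value the target table generated.
declare @keymap table (stagerecno int, newrecno int);

merge parent as tgt
using parent_stage as src
on 1 = 0                          -- never matches, so every source row is inserted
when not matched then
    insert (groupCode, parentdata)
    values (src.groupCode, src.parentdata)
output src.recno, inserted.recno
    into @keymap (stagerecno, newrecno);

-- Rewrite the children's foreign keys through the map and insert them.
insert into child (parentrecno, childdata)
select m.newrecno, cs.childdata
from child_stage cs
join @keymap m on m.stagerecno = cs.parentrecno;

Because the parent-to-child mapping comes from the OUTPUT clause rather than from insertion order, this sidesteps the order-preservation question entirely.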
This is not an absolute bulk insert, but instead it inserts all child data at the same time as the parent data, making only one round trip to the DB.
You can build scripts like the one sketched below in your C# code, and then perform one request per parent.
Be aware that this may not be a good approach if the amount of child data is known to be large. I don't know the details, but I'm sure the size of the SQL script can't grow indefinitely.
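A minimal sketch of one such batch, using the question's schema; the literal values are placeholders for what the C# code would actually emit (ideally as parameters):

-- One batch per parent: insert the parent row, capture its identity,
-- then insert all of its children in the same round trip.
declare @recno int;

insert into parent (groupCode, parentdata)
values (42, 'parent row');        -- placeholder data

set @recno = scope_identity();    -- identity generated for this parent

insert into child (parentrecno, childdata)
values (@recno, 'child 1'),       -- multi-row VALUES requires SQL Server 2008+
       (@recno, 'child 2');

Sending the whole batch through one SqlCommand is what collapses the per-parent ExecuteScalar plus per-child ExecuteNonQuery calls into a single round trip.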