SQL批量插入具有父/子关系,顺序是否保留?

发布于 2024-11-16 10:41:58 字数 1861 浏览 3 评论 0原文

与下面提到的其他问题类似,我有两个具有以下结构的表:

create table parent (
   recno int identity(1,1) primary key not null,
   groupCode int,
   parentdata varchar(80)
);

create table child (
   parentrecno int not null,
   childdata varchar(80)
)

我需要快速将几十万条记录插入到这些表中,并且这些表保存与此插入无关的数百万条其他记录,并且永远不会安静。由于父/子性质,它(看起来)不是 SqlBulkCopy 的良好候选者。

在 C# 中,使用 SqlCommandINSERT 每秒插入大约 400-500 条记录,这有点太慢了。伪代码:

 foreach(Record r in parentRecords)
 {
      Insert Fields from r into SqlCommand Parameters but not "recno"
      Call ExecuteScalar to insert and fetch the inserted identity value (recno)
      foreach(ChildRecord cr in parentRecords.Children)
      {
          Insert Fields from cr into SqlCommand Parameters
          Insert the identity value (recno) from above into Parameters 
                                                       (as parentrecno)
          Call ExecuteNonQuery to insert the record
      }   
 }

读完其他帖子后,我想到了一个想法。附加到父记录的 groupCode 对于我要插入的父记录集是唯一的。是否可以:

  1. 使用 SqlBulkCopy 批量插入父记录,让插入像往常一样自动生成 recno 标识字段。
  2. 仅对插入的记录执行SELECT

    从父级中选择recno,其中groupCode = @thisgroup order by recno;
    
  3. 使用检索到的值填充内存中子记录的parentrecno 字段

  4. 使用 SqlBulkCopy 批量插入子记录

这将依赖于父记录进入SQL 表的顺序与它们位于原始数据表中(并且以相同的顺序分配标识值)。 这是我可以信赖的东西吗?

相关问题:

如何更新数据集父项和父项具有自动生成的身份密钥的子表?

SqlBulkCopy和在标识列上具有父/子关系的数据表

Similar to these other questions noted below, I have two tables with the structure:

create table parent (
   recno int identity(1,1) primary key not null,
   groupCode int,
   parentdata varchar(80)
);

create table child (
   parentrecno int not null,
   childdata varchar(80)
)

I need to insert a few hundred thousand records quickly into these tables -- and the tables hold millions of other records unrelated to this insert and are never quiet. Because of the parent/child nature, it's not a good candidate (it seems) for SqlBulkCopy.

In C# using SqlCommand with INSERT I'm getting about 400-500 records/second inserted, and this is a bit too slow. Pseudocode:

 foreach(Record r in parentRecords)
 {
      Insert Fields from r into SqlCommand Parameters but not "recno"
      Call ExecuteScalar to insert and fetch the inserted identity value (recno)
      foreach(ChildRecord cr in parentRecords.Children)
      {
          Insert Fields from cr into SqlCommand Parameters
          Insert the identity value (recno) from above into Parameters 
                                                       (as parentrecno)
          Call ExecuteNonQuery to insert the record
      }   
 }

After reading those other posts, a though occurred to me. The groupCode attached to the parent records is unique to that set of parent records I'm inserting. Would it work to:

  1. Bulk insert the parent records with SqlBulkCopy, letting the insert auto-generate the recno identity field as usual.
  2. Perform a SELECT on just the inserted records:

    select recno from parent where groupCode = @thisgroup order by recno;
    
  3. Use the retrieved values to fill in the parentrecno fields for the child records in memory

  4. Bulk insert the child records with SqlBulkCopy

This would rely on the parent records going into the SQL table in the same order as they are in the original DataTable (and the identity values being assigned in that same order). Is this something I can rely on?

Related questions:

How to update Dataset Parent & Child tables with Autogenerated Identity Key?

SqlBulkCopy and DataTables with Parent/Child Relation on Identity Column

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

静若繁花 2024-11-23 10:41:58

创建两个与目标表具有相同结构的临时表,但不要在 recno 列上使用标识。

create table parentTmp (
   recno int,
   groupCode int,
   parentdata varchar(80)
);

create table childTmp (
   parentrecno int not null,
   childdata varchar(80)
)

将数据批量加载到暂存表,保持 recno/parentrecno 值不变。

然后您可以使用合并输出 用于从临时表中移动数据。

-- Table variable to hold mapping between 
-- SourceRecno and TargetRecno
declare @recno table(SourceRecno int, TargetRecno int);

-- Merge data from parentTmp to parent
-- Output old and new recno to @recno
merge parent T
using parentTmp S
on 0=1
when not matched then
  insert (groupCode, parentdata)
    values (S.groupCode, S.parentData)
output S.recno, inserted.recno into @recno;

-- Copy data from childTmp to child
-- Use @recno to get the new recno
insert into child(parentrecno, childdata)
select R.TargetRecno, C.childdata
from childTmp as C
  inner join @recno as R
    on C.parentrecno = R.SourceRecno;

这仅适用于 SQL Server 2008(我推测是更高版本)。

Create two staging tables with the same structure as your target tables but don't use identity on recno column.

create table parentTmp (
   recno int,
   groupCode int,
   parentdata varchar(80)
);

create table childTmp (
   parentrecno int not null,
   childdata varchar(80)
)

Bulk load your data to the staging tables, keeping the recno/parentrecno values as is.

Then you can use merge and output to move the data from the staging tables.

-- Table variable to hold mapping between 
-- SourceRecno and TargetRecno
declare @recno table(SourceRecno int, TargetRecno int);

-- Merge data from parentTmp to parent
-- Output old and new recno to @recno
merge parent T
using parentTmp S
on 0=1
when not matched then
  insert (groupCode, parentdata)
    values (S.groupCode, S.parentData)
output S.recno, inserted.recno into @recno;

-- Copy data from childTmp to child
-- Use @recno to get the new recno
insert into child(parentrecno, childdata)
select R.TargetRecno, C.childdata
from childTmp as C
  inner join @recno as R
    on C.parentrecno = R.SourceRecno;

This will only work in SQL Server 2008 (and later I presume).

清风夜微凉 2024-11-23 10:41:58

这不是绝对的批量插入,而是将所有子数据与父数据同时插入,仅与数据库进行 1 次往返。

insert into parent(groupcode, parentdata) values(1, 'parent data');
insert into child(parentrecno, childdata) select parentrecno, childdata from (
    select SCOPE_IDENTITY() as parentrecno, 'child data 1' as childdata
    union
    select SCOPE_IDENTITY() as parentrecno, 'child data 2' as childdata
    union
    select SCOPE_IDENTITY() as parentrecno, 'child data 3' as childdata
) childrendata;

您可以在 C# 代码中构建这样的脚本,然后对每个父级执行一个请求。

请注意,如果已知子数据量很大,这可能不是一个好方法。不知道细节,但我确信sql脚本的大小不能无限增长。

This is not an absolut bulk insert, but instead it inserts all child data at the same time with the parent data, making only 1 round trip to the DB.

insert into parent(groupcode, parentdata) values(1, 'parent data');
insert into child(parentrecno, childdata) select parentrecno, childdata from (
    select SCOPE_IDENTITY() as parentrecno, 'child data 1' as childdata
    union
    select SCOPE_IDENTITY() as parentrecno, 'child data 2' as childdata
    union
    select SCOPE_IDENTITY() as parentrecno, 'child data 3' as childdata
) childrendata;

You can build scripts like this in your C# code, and then perform one request per parent.

Be aware that this may not be a good approach if the amount of child data is known to be large. Don't know the details, but I'm sure that the size of the sql script can't grow indefinitely.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文