How to use BULK INSERT when rows depend on foreign key values?

Published 2024-12-19 07:25:10


My question is related to this one I asked on ServerFault.

Based on this, I've considered using BULK INSERT. I now understand that I have to prepare a file for each entity I want to save into the database. Regardless, I still wonder whether BULK INSERT will avoid the memory issue on my system described in the referenced ServerFault question.

As for the Streets table, it's quite simple! I have only two cities and five sectors to care about as the foreign keys. But then, how about the Addresses? The Addresses table is structured like this:

AddressId int not null identity(1,1) primary key
StreetNumber int null
NumberSuffix_Value int not null DEFAULT 0
StreetId int null references Streets (StreetId)
CityId int not null references Cities (CityId)
SectorId int null references Sectors (SectorId)

As I said on ServerFault, I have about 35,000 addresses to insert. Shall I memorize all the IDs? =P

And then there are the citizens to insert, each of whom is associated with an address.

PersonId int not null identity(1,1) primary key
Surname nvarchar not null
FirstName nvarchar not null
IsActive bit
AddressId int null references Addresses (AddressId)

The only thing I can think of is to force the IDs to static values, but then I lose the flexibility I had with my former INSERT..SELECT strategy.

What are then my options?

  1. I force the IDs to always be the same; then I have to SET IDENTITY_INSERT ON so that I can force the values into the table. This way, each of my rows always has the same ID, just as suggested here.

  2. How do I BULK INSERT with foreign keys? I can't find any docs on this anywhere. =(
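Option 1 above can be sketched like this. The column list and values are purely illustrative, since the real Streets schema isn't shown in the question; note that SET IDENTITY_INSERT requires an explicit column list on the INSERT.

```sql
-- Hypothetical illustration of option 1: forcing explicit identity values.
SET IDENTITY_INSERT Streets ON;

INSERT INTO Streets (StreetId, StreetName)   -- column list required with IDENTITY_INSERT
VALUES (1, N'Main Street'),
       (2, N'Oak Avenue');

SET IDENTITY_INSERT Streets OFF;
```

With the IDs pinned this way, the Addresses file can reference StreetId = 1 or 2 with confidence, at the cost of maintaining those IDs by hand.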

Thanks for your kind assistance!

EDIT

I edited in order to include the BULK INSERT SQL instruction that finally worked for me!

I had my Excel workbook ready with the information I needed to insert. So I simply created a few supplemental worksheets and began writing formulas to "import" the data into these new sheets, one for each of my entities.

  1. Streets;
  2. Addresses;
  3. Citizens.

As for the two other entities, it wasn't worth bulk-inserting them, as I had only two cities and five sectors (city subdivisions) to insert. Once both the cities and sectors were inserted, I noted their respective IDs and began preparing my record sets for bulk insert. Using the power of Excel to compute the values and "import" the foreign keys was a charm in itself, by the way. Afterwards, I saved each worksheet to a separate CSV file. My records were then ready to be bulk-loaded.
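As an illustration, a Streets.csv prepared this way might look like the following. The column names are assumptions, since the Streets schema isn't shown; the key points are the header row (hence FIRSTROW = 2 below) and the explicit StreetId values carried in the file (hence KEEPIDENTITY).

```csv
StreetId,StreetName,CityId
1,Main Street,1
2,Oak Avenue,1
3,Elm Street,2
```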

USE [DatabaseName]
GO

delete from Citizens
delete from Addresses
delete from Streets

BULK INSERT Streets
    FROM N'C:\SomeFolder\SomeSubfolder\Streets.csv'
    WITH (
        FIRSTROW = 2
        , KEEPIDENTITY
        , FIELDTERMINATOR = N','
        , ROWTERMINATOR = N'\n'
        , CODEPAGE = N'ACP'
    )
GO
  • FIRSTROW

    Indicates the row number at which the insert begins. In my situation, my CSVs contained column headers, so the second row was the one to begin with. One could also start anywhere in the file, say at the 15th row.

  • KEEPIDENTITY

    Allows the bulk insert to use the entity IDs specified in the file even though the table has an identity column. This option has the same effect as running SET IDENTITY_INSERT my_table ON before inserting rows when you wish to insert with precise IDs.

As for the other parameters, they speak for themselves.

Now that this is explained, the same code was repeated for each of the two remaining entities to insert Addresses and Citizens. Because KEEPIDENTITY was specified, all of my foreign keys remained intact, even though my primary keys are set as identity columns in SQL Server.
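For instance, the Addresses load is the same statement with only the table and file names changed (the path here mirrors the illustrative one above):

```sql
BULK INSERT Addresses
    FROM N'C:\SomeFolder\SomeSubfolder\Addresses.csv'
    WITH (
        FIRSTROW = 2
        , KEEPIDENTITY            -- keep the AddressId values from the file
        , FIELDTERMINATOR = N','
        , ROWTERMINATOR = N'\n'
        , CODEPAGE = N'ACP'
    )
GO
```

Because the Citizens file references those same AddressId values, loading order matters: Streets, then Addresses, then Citizens, so each foreign key target exists before it is referenced.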

One tweak, though, exactly as marc_s said in his answer: import your data as fast as you can into a staging table with no restrictions at all. That way you make your life much easier while still following good practice. =)


Answer by 温柔女人霸气范 (2024-12-26 07:25:10)


The basic idea is to bulk insert your data into a staging table that doesn't have any restrictions, any constraints etc. - just bulk load the data as fast as you can.

Once you have the data in the staging table, then you need to start to worry about constraints etc. when you insert the data from the staging table into the real tables.

Here, you could e.g.

  • insert only those rows into your real work tables that match all the criteria (and mark them as "successfully inserted" in your staging table)

  • handle all rows left in the staging table that weren't successfully inserted through some error / recovery process - whatever that may be: printing a report with all the "problem" rows, tossing them into an "error bin", or anything else - totally up to you.

Key point is: the actual BULK INSERT should go into a totally unconstrained table - just load the data as fast as you can - and only then, in a second step, start to worry about constraints, lookup data, references, and the like.
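A minimal sketch of this staging pattern, applied to the Citizens load from the question. The staging table name, the nvarchar lengths, and the validation rule are assumptions for illustration; the point is that step 1 has no keys or constraints to check, and step 2 only moves rows whose foreign key actually resolves.

```sql
-- Assumed staging table: same columns as Citizens, but no keys or constraints.
CREATE TABLE Citizens_Staging (
    PersonId  int,
    Surname   nvarchar(100),
    FirstName nvarchar(100),
    IsActive  bit,
    AddressId int
);

-- Step 1: load as fast as possible; nothing to validate here.
BULK INSERT Citizens_Staging
    FROM N'C:\SomeFolder\SomeSubfolder\Citizens.csv'
    WITH (FIRSTROW = 2, FIELDTERMINATOR = N',', ROWTERMINATOR = N'\n');

-- Step 2: move only rows whose AddressId is NULL or resolves to a real address.
SET IDENTITY_INSERT Citizens ON;

INSERT INTO Citizens (PersonId, Surname, FirstName, IsActive, AddressId)
SELECT s.PersonId, s.Surname, s.FirstName, s.IsActive, s.AddressId
FROM Citizens_Staging AS s
WHERE s.AddressId IS NULL
   OR EXISTS (SELECT 1 FROM Addresses AS a WHERE a.AddressId = s.AddressId);

SET IDENTITY_INSERT Citizens OFF;
```

Rows left behind in Citizens_Staging after step 2 are exactly the "problem" rows the answer mentions, ready for a report or an error bin.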
