替换 SQL Server 中的 3 级深度嵌套游标

发布于 2025-01-11 01:24:07 字数 2604 浏览 4 评论 0原文

我有三个 SQL Server 表，需要循环和更新。我使用游标成功完成了此操作，但速度太慢，几乎毫无意义，因为包含所有要循环的数据的主表长度超过 1,000 行。

这些表（带有一些示例数据）：

-- The PK is InvoiceId and the IsMajorPart is '0' or '1'.
-- The MajorPartId and SubPartId1 to 4 are "technically" FKs for PartId but aren't hooked up and will not be ever due to some external issues outside of scope.
-- The part Id's can be NULL or empty.
-- This table exists elsewhere and is loaded with Id's being varchars but in transfering they will be going in as int's which is the proper way.
CREATE TABLE dbo.Invoices(
    InvoicdeId varchar(50),
    PartName varchar(255),
    IsMajorPart varchar(1)
    MajorPartId varchar(50),
    SubPartId1 varchar(50),
    SubPartId2 varchar(50),
    SubPartId3 varchar(50),
    SubPartId4 varchar(50));

-- Sampe inserts
INSERT INTO dbo.Invoices VALUES ('1', 'A Part', '0', '', '100', '105', '' ,''):
INSERT INTO dbo.Invoices VALUES ('5', 'E Part', '1', '101', '110', '', '' ,''):
INSERT INTO dbo.Invoices VALUES ('11', 'Z Part', '1', '201', '100', '115', '' ,''):


-- Essentially the old table above is being moved into a normalized, correct tables below.


- The PK is the PartId
CREATE TABLE dbo.Parts
    PartsId int,
    PartName varchar(255)

-- Sampe inserts (that will be updated or inserted by looping through the first table)
INSERT INTO dbo.Parts VALUES (100,'A Part'):
INSERT INTO dbo.Parts VALUES (110,'B Part'):
INSERT INTO dbo.Parts VALUES (201,'C Part'):


-- The PK is the combination of InvoiceId and PartId
CREATE TABLE dbo.InvoiceToParts
    InvoiceId int,
    PartsId int,
    IsMajorPart bit);

-- Sampe inserts (that will be inserted from the first table but conflicts might occur if an InvoiceId from the first table has 2 PartId's that are the same)
INSERT INTO dbo.Parts VALUES (1, 100, 0):
INSERT INTO dbo.Parts VALUES (5, 100, 1):
INSERT INTO dbo.Parts VALUES (17, 201, 0):

上面的示例 INSERT 只是用于查看表中内容的数据示例。

将 Invoices（我不关心此表发生什么情况）移动到 Parts 和 InvoiceToParts 的正确表中的规则如下（最后两个表是我唯一关心的。

循环遍历 Invoices 并获取所有数据。
首先，找出 IsMajorPart 是否为“1”，然后获取。这MajorPartId。
如果 PartName 不存在，则将 MajorPartId 推送到 Parts 表中
。 >InvoiceToParts 查看 InvoiceId 和 PartId 的 PK 是否存在，
如果存在，
则将 IsMaorPart 更新为“1”。他们不存在，
接下来对所有 SubPartId1 到 SubPartId4 执行相同的过程，

我有一个嵌套的 3 级游标，其性能运行了超过 30 分钟。我停止了它，因为它还没有接近完成，并且正在耗尽所有资源，我正在尝试寻找一种更快的方法来做到这一点。 Invoices 表最多可以包含大约 5,000 行。

原文

I have three SQL Server tables that I need to loop trhough and update. I did it successfully with a cursor but it is so slow that it is pretty pointless sincethe main table with all the data to loop through is over 1,000 rows long.

The tables are (with some sample data):

-- The PK is InvoiceId and the IsMajorPart is '0' or '1'.
-- The MajorPartId and SubPartId1 to 4 are "technically" FKs for PartId but aren't hooked up and will not be ever due to some external issues outside of scope.
-- The part Id's can be NULL or empty.
-- This table exists elsewhere and is loaded with Id's being varchars but in transfering they will be going in as int's which is the proper way.
CREATE TABLE dbo.Invoices(
    InvoicdeId varchar(50),
    PartName varchar(255),
    IsMajorPart varchar(1)
    MajorPartId varchar(50),
    SubPartId1 varchar(50),
    SubPartId2 varchar(50),
    SubPartId3 varchar(50),
    SubPartId4 varchar(50));

-- Sampe inserts
INSERT INTO dbo.Invoices VALUES ('1', 'A Part', '0', '', '100', '105', '' ,''):
INSERT INTO dbo.Invoices VALUES ('5', 'E Part', '1', '101', '110', '', '' ,''):
INSERT INTO dbo.Invoices VALUES ('11', 'Z Part', '1', '201', '100', '115', '' ,''):


-- Essentially the old table above is being moved into a normalized, correct tables below.


- The PK is the PartId
CREATE TABLE dbo.Parts
    PartsId int,
    PartName varchar(255)

-- Sampe inserts (that will be updated or inserted by looping through the first table)
INSERT INTO dbo.Parts VALUES (100,'A Part'):
INSERT INTO dbo.Parts VALUES (110,'B Part'):
INSERT INTO dbo.Parts VALUES (201,'C Part'):


-- The PK is the combination of InvoiceId and PartId
CREATE TABLE dbo.InvoiceToParts
    InvoiceId int,
    PartsId int,
    IsMajorPart bit);

-- Sampe inserts (that will be inserted from the first table but conflicts might occur if an InvoiceId from the first table has 2 PartId's that are the same)
INSERT INTO dbo.Parts VALUES (1, 100, 0):
INSERT INTO dbo.Parts VALUES (5, 100, 1):
INSERT INTO dbo.Parts VALUES (17, 201, 0):

The sample INSERTs above are just samples of the data for seeing what is in the tables.

The rules to move Invoices (I don't care what happens to this table), into the correct tables of Parts and InvoiceToParts are below (and these last two tables are the only ones that I care about.

Loop through Invoices and get all the data.
First, find out if IsMajorPart is '1' and then get the MajorPartId.
Push the MajorPartId with PartName in Parts table if it DOESN'T already exist.
Next check InvoiceToParts to see if the PK of InvoiceId and PartId exist.
If they do, update IsMaorPart to '1'.
If they don't exist, INSERT it.
Next do the same process for all SubPartId1 to SubPartId4.

I have a nested 3-level cursor which performance-wise ran for over 30min before I stopped it as it wasn't even close to finishing and was sucking up all the resources. I am trying to look for a faster way to do this. The Invoices table can have up to about 5,000 rows in it.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

奈何桥上唱咆哮 2025-01-18 01:24:07

您需要对数据进行逆透视，然后执行所谓的 UPSERT，该操作有两个步骤：

如果存在，则更新记录
如果不存在，则插入记录

如果您在线搜索 UPSERT

表的示例，则会有大量示例设置

DROP TABLE IF EXISTS #Invoice
DROP TABLE IF EXISTS #Unpivot
DROP TABLE IF EXISTS #InvoiceToParts
DROP TABLE IF EXISTS #Parts


CREATE TABLE #Parts(
    PartsId int,
    PartName varchar(255)
)

CREATE TABLE #InvoiceToParts(
    InvoiceId int,
    PartsId int,
    IsMajorPart bit
);

CREATE TABLE #Invoice(
    InvoiceId varchar(50),
    PartName varchar(255),
    IsMajorPart varchar(1),
    MajorPartsID varchar(50),
    SubPartsID1 varchar(50),
    SubPartsID2 varchar(50),
    SubPartsID3 varchar(50),
    SubPartsID4 varchar(50)
);

INSERT INTO #Invoice
VALUES ('1', 'A Part', '0', '', '100', '105', '' ,'')
,('5', 'E Part', '1', '101', '110', '', '' ,'')
,('11', 'Z Part', '1', '201', '100', '115', '' ,'')

SQL 来处理数据

将首先对数据进行逆透视，然后首先加载到 Parts 表中，以便在插入到联结表 InvoicetoParts 之前可以引用 ID

SELECT A.InvoiceId
    ,B.*
INTO #Unpivot
FROM #Invoice AS A
CROSS APPLY (
    VALUES 
    (NULLIF(MajorPartsID,''),PartName,IsMajorPart)
    ,(NULLIF(SubPartsID1,''),NULL,0)
    ,(NULLIF(SubPartsID2,''),NULL,0)
    ,(NULLIF(SubPartsID3,''),NULL,0)
    ,(NULLIF(SubPartsID4,''),NULL,0)
) AS B(PartsID,PartName,IsMajorPart)
WHERE B.PartsID IS NOT NULL /*If not data, filter out*/

/*INSERT into table Parts if not exists*/
INSERT INTO #Parts
SELECT PartsID,PartName
FROM #Unpivot AS A
WHERE A.IsMajorPart = 1
AND NOT EXISTS (
    SELECT *
    FROM #Parts AS DTA
    WHERE A.PartsID = DTA.PartsID
    )
GROUP BY PartsID,PartName

/*UPSERT into table dbo.InvoiceParts*/
UPDATE #InvoiceToParts
SET IsMajorPart = B.IsMajorPart
FROM #InvoiceToParts AS A
INNER JOIN #Unpivot AS B
     ON A.InvoiceId = B.InvoiceId
     AND A.PartsId = B.PartsID

INSERT INTO #InvoiceToParts(InvoiceId,PartsId,IsMajorPart)
SELECT InvoiceId
    ,PartsId
    ,IsMajorPart
FROM #Unpivot AS A
WHERE NOT EXISTS (
    SELECT *
    FROM #InvoiceToParts AS DTA
    WHERE A.InvoiceId = DTA.InvoiceID
        AND A.PartsID = DTA.PartsID
    )

SELECT *
FROM #InvoiceToParts

SELECT *
FROM #Parts

You need to unpivot your data and then just do what is called an UPSERT, which has two steps:

If exists, update record(s)
If not exists, insert record(s)

Plenty of examples if you search for examples online for UPSERT

Table Setup

DROP TABLE IF EXISTS #Invoice
DROP TABLE IF EXISTS #Unpivot
DROP TABLE IF EXISTS #InvoiceToParts
DROP TABLE IF EXISTS #Parts


CREATE TABLE #Parts(
    PartsId int,
    PartName varchar(255)
)

CREATE TABLE #InvoiceToParts(
    InvoiceId int,
    PartsId int,
    IsMajorPart bit
);

CREATE TABLE #Invoice(
    InvoiceId varchar(50),
    PartName varchar(255),
    IsMajorPart varchar(1),
    MajorPartsID varchar(50),
    SubPartsID1 varchar(50),
    SubPartsID2 varchar(50),
    SubPartsID3 varchar(50),
    SubPartsID4 varchar(50)
);

INSERT INTO #Invoice
VALUES ('1', 'A Part', '0', '', '100', '105', '' ,'')
,('5', 'E Part', '1', '101', '110', '', '' ,'')
,('11', 'Z Part', '1', '201', '100', '115', '' ,'')

SQL to Process Data

Will first unpivot the data, then load into Parts table first so the ID's can be referenced before inserting into the junction table InvoicetoParts

SELECT A.InvoiceId
    ,B.*
INTO #Unpivot
FROM #Invoice AS A
CROSS APPLY (
    VALUES 
    (NULLIF(MajorPartsID,''),PartName,IsMajorPart)
    ,(NULLIF(SubPartsID1,''),NULL,0)
    ,(NULLIF(SubPartsID2,''),NULL,0)
    ,(NULLIF(SubPartsID3,''),NULL,0)
    ,(NULLIF(SubPartsID4,''),NULL,0)
) AS B(PartsID,PartName,IsMajorPart)
WHERE B.PartsID IS NOT NULL /*If not data, filter out*/

/*INSERT into table Parts if not exists*/
INSERT INTO #Parts
SELECT PartsID,PartName
FROM #Unpivot AS A
WHERE A.IsMajorPart = 1
AND NOT EXISTS (
    SELECT *
    FROM #Parts AS DTA
    WHERE A.PartsID = DTA.PartsID
    )
GROUP BY PartsID,PartName

/*UPSERT into table dbo.InvoiceParts*/
UPDATE #InvoiceToParts
SET IsMajorPart = B.IsMajorPart
FROM #InvoiceToParts AS A
INNER JOIN #Unpivot AS B
     ON A.InvoiceId = B.InvoiceId
     AND A.PartsId = B.PartsID

INSERT INTO #InvoiceToParts(InvoiceId,PartsId,IsMajorPart)
SELECT InvoiceId
    ,PartsId
    ,IsMajorPart
FROM #Unpivot AS A
WHERE NOT EXISTS (
    SELECT *
    FROM #InvoiceToParts AS DTA
    WHERE A.InvoiceId = DTA.InvoiceID
        AND A.PartsID = DTA.PartsID
    )

SELECT *
FROM #InvoiceToParts

SELECT *
FROM #Parts

回复收藏 0 原文

~没有更多了~