SSIS - Bulk updates at the database field level

Posted 2024-08-25 16:19:56 · 849 characters · 8 views · 0 comments

Here's our mission:

  • Receive files from clients. Each file contains anywhere from 1 to 1,000,000 records.
  • Records are loaded to a staging area and business-rule validation is applied.
  • Valid records are then pumped into an OLTP database in a batch fashion, with the following rules:
    • If a record does not exist (we have a key, so this isn't an issue), create it.
    • If the record exists, optionally update each database field. The decision is made based on one of 3 factors... I don't believe it's important what those factors are.

Our main problem is finding an efficient method of optionally updating the data at a field level. This is applicable across ~12 different database tables, with anywhere from 10 to 150 fields in each table (original DB design leaves much to be desired, but it is what it is).

Our first attempt has been to introduce a table that mirrors the staging environment (1 field in staging for each system field) and contains a masking flag. The value of the masking flag represents the 3 factors.

We've then written an UPDATE similar to...

UPDATE OLTPTable1 SET Field1 = CASE 
  WHEN Mask.Field1 = 0 THEN Staging.Field1
  WHEN Mask.Field1 = 1 THEN COALESCE( Staging.Field1 , OLTPTable1.Field1 )
  WHEN Mask.Field1 = 2 THEN COALESCE( OLTPTable1.Field1 , Staging.Field1 )
...

As you can imagine, the performance is rather horrendous.
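
For context, the full statement is shaped roughly like this for one table (RecordKey below stands in for our actual key columns, and Staging / Mask stand in for the real staging and mask table names):

UPDATE OLTPTable1 SET Field1 = CASE
  WHEN Mask.Field1 = 0 THEN Staging.Field1
  WHEN Mask.Field1 = 1 THEN COALESCE( Staging.Field1 , OLTPTable1.Field1 )
  WHEN Mask.Field1 = 2 THEN COALESCE( OLTPTable1.Field1 , Staging.Field1 )
  END
  -- , Field2 = CASE ... END   (and so on for every remaining field)
FROM OLTPTable1
JOIN Staging ON Staging.RecordKey = OLTPTable1.RecordKey
JOIN Mask    ON Mask.RecordKey    = Staging.RecordKey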

Has anyone tackled a similar requirement?

We're an MS shop using a Windows Service to launch the SSIS packages that handle the data processing. Unfortunately, we're pretty much novices at this stuff.

Comments (3)

零度℉ 2024-09-01 16:19:56

If you are using SQL Server 2008, look into the MERGE statement; it may be suitable for your upsert needs here.

Can you use a Conditional Split on the input to send rows to different processing stages depending on which factor is matched? It sounds like you may need to do this for each of the 12 tables, but you could potentially run some of them in parallel.
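
As a rough sketch only (reusing the names from the question, with RecordKey standing in for the real key and only Field1 shown), a mask-driven MERGE for one table might look like:

MERGE OLTPTable1 AS tgt
USING ( SELECT Staging.RecordKey,
               Staging.Field1,
               Mask.Field1 AS Field1Mask
        FROM Staging
        JOIN Mask ON Mask.RecordKey = Staging.RecordKey ) AS src
    ON tgt.RecordKey = src.RecordKey
WHEN NOT MATCHED BY TARGET THEN
    INSERT ( RecordKey, Field1 )
    VALUES ( src.RecordKey, src.Field1 )
WHEN MATCHED THEN
    UPDATE SET Field1 = CASE
        WHEN src.Field1Mask = 0 THEN src.Field1
        WHEN src.Field1Mask = 1 THEN COALESCE( src.Field1 , tgt.Field1 )
        WHEN src.Field1Mask = 2 THEN COALESCE( tgt.Field1 , src.Field1 )
    END;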

差↓一点笑了 2024-09-01 16:19:56

I took a look at the merge tool, but I'm not sure it would allow the flexibility to indicate which data source takes precedence based on a predefined set of rules.

That capability is critical for a system that lets multiple members, who can have very different needs, use the same process.

From what I have read, the Merge function is more of a sorted union.

眼藏柔 2024-09-01 16:19:56

We do use an approach similar to what you describe in our product for external system inputs (we handle a couple of hundred target tables with up to 240 columns). Like you describe, there's anywhere from 1 to a million or more rows.

Generally, we don't try to set up a single mass update; we try to handle one column's values at a time. Given that they're all a single type representing the same data element, the staging UPDATE statements are simple. We generally create scratch tables for mapping values, and it's a simple

UPDATE target SET target.column = mapping.resultcolumn WHERE target.sourcecolumn = mapping.sourcecolumn.

Setting up the mappings is a little involved, but we again deal with one column at a time while doing that.

I don't know how you define 'horrendous'. For us, this process is done in batch mode, generally overnight, so absolute performance is almost never an issue.

EDIT:
We also do these in configurable-size batches, so the working sets & COMMITs are never huge. Our default is 1,000 rows in a batch, but some specific situations have benefited from batches of up to 40,000 rows. We also add indexes to the working data for specific tables.
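
A rough illustration of that pattern (Target1, ScratchMap and the column names here are invented for the example, not our real schema):

DECLARE @BatchSize int = 1000;   -- our default; some loads do better with larger batches
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    -- Map one column's values in batches from the scratch table.
    UPDATE TOP (@BatchSize) t
    SET    t.Code = m.ResultCode
    FROM   Target1    AS t
    JOIN   ScratchMap AS m ON m.SourceCode = t.Code
    WHERE  t.Code <> m.ResultCode;   -- skip already-converted rows so the loop terminates

    SET @Rows = @@ROWCOUNT;          -- each batch is its own statement, so commits stay small
END;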
