使用 SSIS 或仅使用普通 T-SQL 导入和验证 XML 文件?

发布于 2024-12-01 07:54:48 字数 1431 浏览 0 评论 0原文

将 XML 文件导入并验证到 SQL Server 中的单个表(扁平化)时的最佳实践是什么?

我有一个 XML 文件,其中包含大约 15 种复杂类型,它们都与单个父元素相关。 SSIS 设计可能如下所示: SSIS 但所有这些 (15) 个连接变得非常复杂。

将 T-SQL 代码写入以下内容可能是一个更好的主意:
1) 将 XML 导入到 XML 类型并链接到 XSD 架构的列中。
2) 使用此代码:

TRUNCATE TABLE XML_Import
INSERT INTO XML_Import(ImportDateTime, XmlData)
SELECT GETDATE(), XmlData 
FROM
(
    SELECT  * 
    FROM    OPENROWSET (BULK 'c:\XML-Data.xml', SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)

delete from dbo.UserFlat
INSERT INTO dbo.UserFlat
SELECT
    user.value('(UserIdentifier)', 'varchar(8)') as UserIdentifier,
    user.value('(Emailaddress)', 'varchar(70)') as Emailaddress,
    businessaddress.value('(Fax)', 'varchar(70)') as Fax,
    employment.value('(EmploymentData)', 'varchar(8)') as EmploymentData,
    -- More values here ...
FROM  
    XML_Import CROSS APPLY
    XmlData.nodes('//user') AS User(user) CROSS APPLY
    user.nodes('BusinessAddress') AS BusinessAddress(businessaddress) CROSS APPLY
    user.nodes('Employment') AS Employment(employment)
    -- More 'joins' here ...

填充“UserFlat”表?
一些缺点是您必须手动键入 SQLcode,但这里的优点是我可以更直接地控制元素的处理和转换方式。但我不知道在SSIS中处理XML和使用T-SQL XML语句处理XML之间是否有任何性能差异。


请注意,其他一些要求是:

  1. 错误处理:如果发生错误,必须向某人发送电子邮件。
  2. 能够处理具有特定文件名模式的多个输入文件:XML_{date}_{time}.xml
  3. 将处理后的 XML 文件移动到不同的文件夹。

请指教。

What is the best practice when importing and validating an XML file to a single table (flattened) in SQL Server ?

I've a XML file which contains about 15 complex types which are all related to a single parent element.
The SSIS design could look like this:
SSIS
But it's getting very complicated with all those (15) joins.

Is it maybe a better idea to just write T-SQL code to :
1) Import the XML into a column which is of the type XML and is linked to a XSD-schema.
2) Use this code:

TRUNCATE TABLE XML_Import
INSERT INTO XML_Import(ImportDateTime, XmlData)
SELECT GETDATE(), XmlData 
FROM
(
    SELECT  * 
    FROM    OPENROWSET (BULK 'c:\XML-Data.xml', SINGLE_BLOB) AS XMLDATA
) AS FileImport (XMLDATA)

delete from dbo.UserFlat
INSERT INTO dbo.UserFlat
SELECT
    user.value('(UserIdentifier)', 'varchar(8)') as UserIdentifier,
    user.value('(Emailaddress)', 'varchar(70)') as Emailaddress,
    businessaddress.value('(Fax)', 'varchar(70)') as Fax,
    employment.value('(EmploymentData)', 'varchar(8)') as EmploymentData,
    -- More values here ...
FROM  
    XML_Import CROSS APPLY
    XmlData.nodes('//user') AS User(user) CROSS APPLY
    user.nodes('BusinessAddress') AS BusinessAddress(businessaddress) CROSS APPLY
    user.nodes('Employment') AS Employment(employment)
    -- More 'joins' here ...

to fill the 'UserFlat' table ?
Some disadvantages are that you have to manually type the SQLcode, but the advantage here is that I have more direct control how the elements are processed and converted. But I don't know if there are any performance differences between processing XML in SSIS and processing the XML with T-SQL XML statements.

Note that some other requirements are:

  1. Error handling : in case of an error, an email must be send to a person.
  2. Able to process multiple input files with a specific file name pattern : XML_{date}_{time}.xml
  3. Move the processed XML files to a different folder.

Please advice.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

旧时光的容颜 2024-12-08 07:54:48

根据您提到的要求,我想说您可以使用两全其美的方法(T-SQL 和 SSIS)。

我认为 T-SQL 在加载您在问题中描述的 XML 数据方面提供了更大的灵活性。

有很多不同的方法可以实现这一目标。以下是一个可能的选项:

  1. 创建一个存储过程,它将 XML 文件的路径作为输入参数。

  2. 使用您认为更简单的 T-SQL 方式执行 XML 数据加载操作。

  3. 使用SSIS包执行错误处理、文件处理、归档和发送电子邮件。

  4. 使用 SSIS 中提供的日志记录功能。它只需要简单的配置。以下示例展示了如何在 SSIS 中配置日志记录 如何跟踪 SSIS 数据流任务中成功处理或失败的行的状态?

  5. 示例模拟您的流程将如下面的屏幕截图所示。使用 Foreach 循环容器循环文件。将文件路径作为参数传递给执行 SQL 任务,该任务又会调用您提到的 T-SQL。处理文件后,使用文件系统任务将文件移动到存档文件夹。

  6. SSIS 从文件夹读取多个 xml 文件中使用的示例< /a>
    演示如何使用 Foreach 循环容器循环访问文件。它循环遍历 xml 文件,但使用数据流任务,因为 xml 文件的格式更简单。

  7. 如何使用 SSIS 包发送电子邮件正文中表中的记录? 显示如何使用发送邮件发送电子邮件任务。

  8. 处理文件后如何将文件移动到存档文件夹? 显示如何将文件移动到存档文件夹。

  9. 在 SSIS 中执行文件系统任务后进行分支而不使包失败展示了即使在特定任务失败后如何继续执行包。即使 Foreach 循环失败,这也将帮助您继续执行包,以便您可以发送电子邮件。屏幕截图中的蓝色箭头表示完成上一个任务。

  10. 如何使用 SSIS 包中的 Foreach 循环容器选择最近创建的文件夹? 显示如何执行模式匹配。

希望能给你一个想法。

流

Based on the requirements that you have mentioned, I would say that you can use best of both the worlds (T-SQL & SSIS).

I feel that T-SQL gives more flexibility in loading the XML data that you have described in the question.

There are lot of different ways you can achieve this. Here is one possible option:

  1. Create a Stored Procedure that would take the path of the XML file as input parameter.

  2. Perform your XML data load operation using the T-SQL way which you feel is easier.

  3. Use SSIS package to perform error handling, file processing, archiving and send email.

  4. Use logging feature available in SSIS. It just requires simple configuration. Here is a samples that show how to configure logging in SSIS How to track status of rows successfully processed or failed in SSIS data flow task?

  5. A sample mock up of your flow would be as shown below in the screenshot. Loop the files using Foreach Loop container. Pass the file path as parameter to Execute SQL Task, which in turn would call the T-SQL that you had mentioned. After processing the file, using the File System Task to move the file to an archive folder.

  6. Sample used in SSIS reading multiple xml files from folder
    shows how to loop through files using Foreach loop container. It loops through xml files but uses Data Flow Task because the xml files are in simpler format.

  7. Sample used in How to send the records from a table in an e-mail body using SSIS package? shows how to send e-mail using Send Mail Task.

  8. Sample used in How do I move files to an archive folder after the files have been processed? shows how to move files to an Archive folder.

  9. Sample used in Branching after a file system task in SSIS without failing the package shows how to continue package execution even after a particular task fails. This will help you to proceed with package execution even if Foreach Loop fails so you can send email. Blue arrow in the screenshot indicates on completion of previous task.

  10. Sample used in How do I pick the most recently created folder using Foreach loop container in SSIS package? shows how to perform pattern matching.

Hope that gives you an idea.

Flow

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文