SQL Server 存储过程转换为 SSIS 包

发布于 2024-11-10 13:14:56 字数 232 浏览 3 评论 0原文

问题:目前我们有大量的存储过程(很长,长达 10,000 行),这些存储过程是由不同的开发人员在过去 10 年中根据不同的需求编写的。现在管理那些复杂/长的存储过程(没有适当的文档)变得很困难。

我们计划将这些存储过程移至 SSIS ETL 包中。

有人做过这事吗?如果是,应该采取什么方法。

如果有人能够提供有关将存储过程转换为 SSIS ETL 包的方法的建议,我将不胜感激。

谢谢

Problem: currently we have numerous stored procedures (very long up to 10,000 lines) which were written by various developers for various requirements in last 10 years. It has become hard now to manage those complex/long stored procedures (with no proper documentation).

We plan to move those stored procedure into SSIS ETL package.

Has anybody done this is past? If yes, what approach should one take.

Appreciate if anybody could provide advise on approach to convert stored procedure into SSIS ETL Packages.

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

瑾夏年华 2024-11-17 13:14:56

我以前就这样做过,对我的团队来说行之有效的是从原始源开始逐步重构,然后迭代重构工作。

第一步是尝试将存储过程逻辑模块化为我们链接在一起的执行 SQL 任务。每项任务都经过测试和批准,然后我们将进行整合并确保新流程与遗留流程的结果相匹配。

之后,我们可以在团队中划分各个执行 SQL 任务,并对是否可以进一步将执行 SQL 任务中的 SQL 重构为本机 SSIS 任务进行负载平衡分析。

每个重构都经过单独的单元测试,然后进行集成测试,以确保整个流程输出的行为仍然类似于遗留过程。

I've done this before, and what worked well for my team was to refactor incrementally, starting with the original source, and then iterate the refactoring effort.

The first step was to attempt to modularize the stored procedure logic into Execute SQL tasks that we chained together. Each task was tested and approved, then we'd integrate and ensure that the new process matched the results of the legacy procedures.

After this point, we could divide the individual Execute SQL tasks across the team, and load-balance the analysis of whether we could further refactor the SQL within the Execute SQL tasks to native SSIS tasks.

Each refactoring was individually unit tested and then integration tested to ensure that the overall process output still behaved like the legacy procedures.

淡笑忘祈一世凡恋 2024-11-17 13:14:56

我建议执行以下步骤:

  1. 分析存储过程以识别源和目标列表。例如:如果存储过程dbo.TransferOrders 将数据从表dbo.Order 移动到dbo.OrderHistory。那么您的源将为dbo.Order,目标将为dbo.OrderHistory

  2. 列出源和目标后,尝试根据您的偏好按源/目标对存储过程进行分组。

  3. 尝试查明存储过程中是否发生任何数据转换。 SSIS 中提供了很好的数据转换任务。您可以评估其中一些功能并将其从存储过程移至 SSIS。由于 SSIS 是一种工作流工具,因此我觉得理解包内部的内容比滚动许多行代码来理解功能更容易。但是,那只是我。偏好因人而异。

  4. 尝试识别存储过程中的依赖关系并准备层次结构。这将有助于将任务按适当的顺序放置在包内。

  5. 如果您有名为 dbo.Table1 的表,其中填充了 5 个不同的表。我建议将它们放在一个包中。即使此数据填充由 5 个不同的存储过程执行,您也不需要使用 5 个包。不过,这又取决于您的业务场景。

  6. SSIS 项目解决方案可以包含多个包并重用数据源。您可以使用控制流任务上可用的执行 SQL 任务来运行现有查询,但我建议您还看一下 SSIS 中可用的一些不错的转换任务。我已经在我的项目中使用了它们,并且它们对于 ETL 操作运行良好。

这些步骤可以通过一次查看一个存储过程来完成。您不必一次完成所有这些。

请查看我在其他 Stack Overflow 问题中给出的一些示例。这些应该可以帮助您了解使用 SSIS 可以实现什么目标。

复制数据一个 SQL 表到另一个表

SSIS 中提供的日志记录功能

<一个href="https://stackoverflow.com/questions/6061075/sql-integration-services-to-load-tab-delimited-file/6159600#6159600">将包含 100 万行的平面文件加载到 SQL 表中使用 SSIS

希望有帮助。

I would suggest the following steps:

  1. Analyze the stored procedures to identify the list of sources and destinations. For example: If the stored procedure dbo.TransferOrders moves data from table dbo.Order to dbo.OrderHistory. Then your source will be dbo.Order and destination will be dbo.OrderHistory.

  2. After you list out the sources and destinations, try to group the stored procedures according to your preference either by source/destination.

  3. Try to find out if there are any data transformations happening within the stored procedures. There are good data transformation tasks available within SSIS. You can evaluate and move some of those functionalities from stored procedures to SSIS. Since SSIS is a workflow kind of tool, I feel that it is easier to understand what is going inside the package than having to scroll through many lines of code to understand the functionality. But, that's just me. Preferences differ from person to person.

  4. Try to identify the dependencies within stored procedures and prepare a hierarchy. This will help in placing the tasks inside the package in appropriate order.

  5. If you have table named dbo.Table1 populating 5 different tables. I would recommend having them in a single package. Even if this data population being carried out by 5 different stored procedures, you don't need to go for 5 packages. Still, this again depends on your business scenario.

  6. SSIS project solution can have multiple packages within them and re-use data sources. You can use Execute SQL task available on the Control Flow task to run your existing queries but I would recommend that you also take a look at some of the nice transformation tasks available in SSIS. I have used them in my project and they function well for ETL operations.

These steps can be done by looking into one stored procedure at a time. You don't have to go through all of them at once.

Please have a look at some of the examples that I have given in other Stack Overflow questions. These should help you give an idea of what you can achieve with SSIS.

Copying data from one SQL table to another

Logging feature available in SSIS

Loading a flat file with 1 million rows into SQL tables using SSIS

Hope that helps.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文