如何减少 SSIS ETL 过程的事务日志大小?

发布于 2025-01-06 13:29:14 字数 573 浏览 1 评论 0原文

对于大约 10 GB 的 SQL Server 2008 数据库,我们遇到了事务日志大小增长到大约 70 GB 的问题。这种情况发生在使用 SSIS 2008 运行的 ETL 流程的临时数据库中。该流程大约需要 8 小时才能完成,并且每周运行一次。

70 GB 已经接近系统的物理极限,仅 IO 就会造成性能损失。

我们永远不会使用日志来恢复临时数据库,因为 ETL 将在每次运行时重新开始,并在进程失败时恢复早期的数据库备份。此时,增量 ETL 过程将很难实现,因为配置可能会在运行之间发生变化,从而更改许多或全部计算。

数据库恢复模式设置为简单。可以允许 SISS 对登台数据库进行独占访问。

我们想知道调整 SISS 会带来什么结果。

一些注意事项:

  • SISS 似乎与数据库建立了多个连接。这如何干扰调整包的事务隔离级别和事务的“长度”,即在完成某些步骤后提交?
  • 在单用户模式下运行时进程性能是否会提高? SISS能应付吗?
  • 仅配置包以设置 Transaction = Unsupported 运行是否明智?

关于这个问题有什么经验、想法或建议吗?

We have an issue with a transaction log growing to a size of about 70 GB for a SQL Server 2008 database of about 10 GB. This happens in the staging database of an ETL process run with SSIS 2008. The process takes about 8 hours to complete and is run on a weekly basis.

70 GB is getting close to the physical limits of the system and the I-O alone will have a performance penalty.

We will never make use of the logs for recovery of the staging database, since the ETL will start all over again with each run and restore an earlier database backup on process failures. At this point, an incremental ETL process would be hard to implement, because the configuration can change between runs, changing many or all of the calculations.

The database recovery model is set to simple. SISS can be allowed exclusive access to the staging database.

We wonder what can be expected from tweaking SISS.

Some considerations:

  • Multiple connections seem to made by SISS to the database. How could this interfere with tweaking the transaction isolation levels of packages and the 'length' of transactions, i.e. commit after certain steps are completed?
  • Could the process performance increase when running in single user mode? Could SISS cope with that?
  • Would it be wise to just configure packages to run with setting Transaction = Unsupported?

Any experiences, thoughts or recommendations on this issue?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

真心难拥有 2025-01-13 13:29:14

如果此过程每周运行一次,我假设数据在进入系统时不需要进行处理。为了澄清(如果我错了,请纠正我),您每周都会将一些数据加载到数据库中。

如果日志那么大,我会假设您没有进行批量插入。如果没有,那就去做吧!这应该会加快进程并减少日志大小。

If this process is being run on a weekly basis, I'm assuming that data does not need to be processed as it enters the system. To clarify (and correct me if I'm wrong), you've got some set of data that you're loading into the database once per week.

If the log is that big, I would assume you're not doing bulk inserts. If not, then do it! That should speed up the process and reduce the log size.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文