SQL Server 2005 transaction log entry: LOP_FORMAT_PAGE
I am investigating a large log expansion during an ETL process, even though the database is set to the bulk-logged recovery model (and it is truly bulk-logged, not running in pseudo-simple mode).
Using the ::fn_dblog(null,null) function to examine the transaction log operations and their contexts, I can see that the log expansion is almost entirely down to the logging of LOP_FORMAT_PAGE operations in an LCX_HEAP context. (97% of the expansion is that operation, which appears in the log over 600k times for a single data load.)
The question is: what is LOP_FORMAT_PAGE doing, or what is it recording that SQL Server has done?
Given that, I should be able to work backwards to the chain of cause and effect that produces these records, and alter the ETL if appropriate.
I'm not expecting many people to have come across this one; the available detail on log operations and contexts is minimal to none.
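For anyone wanting to reproduce this kind of breakdown, the following is a minimal sketch of the aggregation, not the exact query used for the figures above. It assumes it is run in the context of the affected database and that the undocumented fn_dblog output exposes the Operation, Context and [Log Record Length] columns; the column aliases are illustrative only.

```sql
-- Summarize the active transaction log by operation and context.
-- fn_dblog is undocumented; the column names used here are an assumption
-- based on its commonly observed output.
SELECT
    Operation,
    Context,
    COUNT(*) AS record_count,
    -- Cast to bigint so the sum cannot overflow int on a large log.
    SUM(CAST([Log Record Length] AS bigint)) AS total_bytes
FROM ::fn_dblog(NULL, NULL)
GROUP BY Operation, Context
ORDER BY total_bytes DESC;
```

Rows for LOP_FORMAT_PAGE / LCX_HEAP near the top of the result would correspond to the pattern described above.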
Comments (3)
You're correct that this is very thinly (AKA not!) documented. I've done a little poking around inside logs and a lot of log-reduction work (mostly by ensuring bulk inserts were actually being done in bulk!), so I know this can be challenging to track down.
My best guess, having seen LOP_FORMAT_PAGE used in context, is that it's clearing out a new page, for example when an index page is full, another entry needs to be created, and the page is split. So, if this assumption is correct, you may want to track down what is causing a whole bunch of new pages to get allocated.
Do you know which operations are going on in the ETL while you're seeing the log expansion? It would be helpful to understand this context, so please add that info to your question if possible.
Also, are you able to run and vary your ETL code in a test environment? Instead of figuring out this inscrutable log record definition, it may be easier to isolate the problem by running your ETL with some steps commented out (or with the number of affected rows limited) and then seeing which change makes the problem go away.
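One way to do that isolation in a test environment is to snapshot the log summary before and after a single ETL step and compare the two. This is only a sketch under a few assumptions: the temp table names are made up for illustration, no log backup or truncation happens between the two snapshots, and the undocumented fn_dblog output has the Operation, Context and [Log Record Length] columns.

```sql
-- 1. Snapshot the log summary before the ETL step under test.
SELECT Operation, Context,
       COUNT(*) AS record_count,
       SUM(CAST([Log Record Length] AS bigint)) AS total_bytes
INTO #log_before              -- hypothetical temp table name
FROM ::fn_dblog(NULL, NULL)
GROUP BY Operation, Context;

-- 2. Run the single ETL step you want to measure here.

-- 3. Snapshot again after the step.
SELECT Operation, Context,
       COUNT(*) AS record_count,
       SUM(CAST([Log Record Length] AS bigint)) AS total_bytes
INTO #log_after               -- hypothetical temp table name
FROM ::fn_dblog(NULL, NULL)
GROUP BY Operation, Context;

-- 4. Show what the step added to the log, largest contributors first.
SELECT a.Operation, a.Context,
       a.record_count - ISNULL(b.record_count, 0) AS added_records,
       a.total_bytes  - ISNULL(b.total_bytes, 0)  AS added_bytes
FROM #log_after AS a
LEFT JOIN #log_before AS b
       ON b.Operation = a.Operation
      AND b.Context   = a.Context
ORDER BY added_bytes DESC;

DROP TABLE #log_before;
DROP TABLE #log_after;
```

Repeating this per step should show which step produces the LOP_FORMAT_PAGE records. The temp tables live in tempdb, so creating them does not add to the log being measured.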
I think you and Justin are onto the answer, but it is not all that complicated.
The ETL process (extract, transform, load) is loading data into the database. Naturally, as pages fill up, new ones need to be allocated on the heap.
I also thought that LOP_FORMAT_PAGE only formats the page. But the log record contains either the full page data, if the count of arrays is 1, or part of the page with data (header plus records) in one array and the offsets to the records from the end of the page in a second array.
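As a rough, back-of-the-envelope check on what this could mean for log size: a SQL Server data page is 8 KB (8192 bytes). If each of the roughly 600k LOP_FORMAT_PAGE records from the question carried a full page image (an assumption, since some records apparently hold only part of a page), the page images alone would come to about:

$$600{,}000 \times 8192\ \text{bytes} = 4{,}915{,}200{,}000\ \text{bytes} \approx 4.6\ \text{GiB}$$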