Heap corruption when trimming the delayed free queue
I'm currently attempting to track down the source of heap corruption in our code base, which doesn't present itself when full page heap tracking is turned on (so only normal page tracking).
I'm using Application Verifier to break on the corruption, and get a not-so-helpful stop code of 00000008:
APPLICATION_VERIFIER_HEAPS_CORRUPTED_HEAP_BLOCK (8)
Corrupted heap block.
This is a generic error issued if the corruption in the heap block cannot be placed in a more specific category.
=======================================
VERIFIER STOP 00000008: pid 0xD30: Corrupted heap block.
00000000 : Heap handle used in the call.
0861C000 : Heap block involved in the operation.
0000043C : Size of the heap block.
00000000 : Reserved
=======================================
I've had to trim down the report to protect the innocent, but bear with me. The callstack shows:
1000c540 00000008 00000000 vrfcore!VerifierStopMessageEx+0x543
00000008 7c969624 00000000 vrfcore!VfCoreRedirectedStopMessage+0x81
00000000 00000009 0861c000 ntdll!RtlpDphReportCorruptedBlock+0x101
04a680ee 01001002 03ce1000 ntdll!RtlpDphTrimDelayedFreeQueue+0x84
03ce1000 01001002 04a680ee ntdll!RtlpDphNormalHeapFree+0xc0
03ce0000 01001002 137a0040 ntdll!RtlpDebugPageHeapFree+0x79
03ce0000 01001002 137a0040 ntdll!RtlDebugFreeHeap+0x2c
03ce0000 01001002 137a0040 ntdll!RtlFreeHeapSlowly+0x37
03ce0000 00000000 137a0040 ntdll!RtlFreeHeap+0xf9
137a0040 137a0040 030dfe61 msvcrt!free+0xc3
Initially, I focused on the call to free(), assuming that the memory I was trying to free was the culprit of the heap corruption. This may still be the case, but I'm no longer convinced. Watching 0x137a0040 as I step through the delete call, the memory seems to be properly freed by the call to RtlpDphNormalHeapFree(). I'm surmising that it is freed properly because the memory from 0x137a0040 to its upper bound some 76 MB later consists solely of f0, defined here as freed memory.
So my attention turns to the call immediately before the call to RtlpDphReportCorruptedBlock(), namely RtlpDphTrimDelayedFreeQueue(). The arguments passed to RtlpDphReportCorruptedBlock() suggest to me (just a guess, as I can't find any hints as to the declarations of these functions) which block is corrupt. Investigating this block shows the following:
0861c000 f0 f0 f0 f0 4f f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 ....O..............
Why is this 5th byte 4f, while all the others are f0 (already freed)? What does RtlpDphTrimDelayedFreeQueue() do? Is the issue (if this is the issue) that this function is trying to free what is obviously already freed memory, or does this function expect this memory to be free already, and lose the plot when it encounters that 5th byte?
(The 5th byte is the only odd one out; the rest of 0x0861c000 to 0x0861c43c is f0.)
Unfortunately, while I can reproduce the heap corruption 100% of the time, the address seems to change every time I place a data breakpoint on it.
I'm running on Windows XP SP3, and the application is written in VC++6.
Any ideas?
Comments (3)
This suggests that you modified the block after you freed it, perhaps from a different thread, or because something still holds a pointer to it. (When you free a block, the runtime fills it with F0, holds on to it for a while, then checks that it is still all F0. It isn't, so it must have been modified after the free.)
If the corruption is at a constant offset into the block, you could set a data breakpoint on that location at the point of the call to free(), so that the debugger stops when it changes.
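To make the fill, park, and check sequence described above concrete, here is a minimal toy model in C++. It is only a sketch under assumptions: `DelayedFreeQueue`, `Corruption`, `DemoUseAfterFree`, and the queue capacity of 2 are all hypothetical names and values chosen for illustration, and nothing here reflects how ntdll actually implements `RtlpDphTrimDelayedFreeQueue()`. The 0xF0 fill and the 0x43C block size are taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <deque>

// Toy model only: every name here is hypothetical, not ntdll's real code.
const unsigned char kFreeFill = 0xF0;   // fill pattern seen in the dump

struct Corruption {
    bool found;            // true if a parked block was modified
    std::size_t offset;    // offset of the first byte that is not kFreeFill
    unsigned char value;   // the byte found at that offset
};

class DelayedFreeQueue {
public:
    explicit DelayedFreeQueue(std::size_t capacity) : capacity_(capacity) {}

    ~DelayedFreeQueue() {
        // Release whatever is still parked when the model goes away.
        for (std::size_t i = 0; i < queue_.size(); ++i) std::free(queue_[i].ptr);
    }

    // Fill the block with the pattern and park it instead of releasing it.
    // If the queue is now over capacity, verify and release the oldest blocks.
    Corruption Free(void* p, std::size_t size) {
        assert(p != 0);
        std::memset(p, kFreeFill, size);
        Entry e = { static_cast<unsigned char*>(p), size };
        queue_.push_back(e);

        Corruption report = { false, 0, 0 };
        while (queue_.size() > capacity_) {
            Entry oldest = queue_.front();
            queue_.pop_front();
            if (!report.found) report = Verify(oldest);
            std::free(oldest.ptr);
        }
        return report;
    }

private:
    struct Entry { unsigned char* ptr; std::size_t size; };

    // Scan a parked block for the first byte that no longer matches the fill.
    static Corruption Verify(const Entry& e) {
        Corruption c = { false, 0, 0 };
        for (std::size_t i = 0; i < e.size; ++i) {
            if (e.ptr[i] != kFreeFill) {
                c.found = true;
                c.offset = i;
                c.value = e.ptr[i];
                break;
            }
        }
        return c;
    }

    std::size_t capacity_;
    std::deque<Entry> queue_;
};

// Mimics the pattern from the question: a stale pointer writes 0x4F at
// offset 4, and the damage is only reported by a later, unrelated free.
Corruption DemoUseAfterFree() {
    DelayedFreeQueue heap(2);                        // park at most 2 blocks
    char* stale = static_cast<char*>(std::malloc(0x43C));
    heap.Free(stale, 0x43C);                         // parked, filled with F0
    stale[4] = 0x4F;                                 // write via dangling pointer
    heap.Free(std::malloc(16), 16);                  // queue holds 2, no trim yet
    return heap.Free(std::malloc(16), 16);           // trim verifies the stale block
}
```

In this model the report comes out of the third `Free()` call, which has nothing to do with the damaged block. That would be consistent with the stack in the question, where the pointer passed to `free()` (0x137a0040) differs from the block named in the verifier stop (0861C000).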
C or C++?
If it is C++, maybe you can override new and delete and find it yourself. Never actually deallocate memory; put it in your own bank instead. Allocate memory with poison fields before and after it, fill the memory with poison while it sits in your bank, and check that poison all the time.
If it is C, you can maybe do something similar with #define malloc. I would also check whether VC6 lets you install your own handlers in place of malloc and free.
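The bank idea could look roughly like the sketch below. All names (`GuardedAlloc`, `GuardedFree`, `CheckBank`, `g_bank`) and the byte values are assumptions made up for illustration. The functions are plain helpers that you would call from your own operator new/delete overrides, or from a malloc/free macro in C-style code. The sketch is not thread-safe and never returns memory to the heap, so it is meant for a debugging build only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper names; a sketch of the "bank" idea, not a drop-in allocator.
const unsigned char kGuardByte  = 0xFD;  // arbitrary value for the guard fields
const unsigned char kPoisonByte = 0xF0;  // arbitrary poison for banked memory
const std::size_t   kGuardSize  = 16;    // bytes of guard on each side
const std::size_t   kHeaderSize = 16;    // stores the user size, keeps alignment

struct Banked { unsigned char* raw; std::size_t size; };
static std::vector<Banked> g_bank;       // "freed" blocks, kept forever

// Layout of one block: [header][front guard][user bytes][back guard]
void* GuardedAlloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(kHeaderSize + 2 * kGuardSize + size));
    if (!raw) return 0;
    std::memcpy(raw, &size, sizeof(size));                        // remember size
    std::memset(raw + kHeaderSize, kGuardByte, kGuardSize);       // front guard
    std::memset(raw + kHeaderSize + kGuardSize + size,
                kGuardByte, kGuardSize);                          // back guard
    return raw + kHeaderSize + kGuardSize;
}

static bool AllBytes(const unsigned char* p, std::size_t n, unsigned char v) {
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != v) return false;
    return true;
}

// Never hands memory back to the heap: poison the user bytes and bank the block.
void GuardedFree(void* user) {
    if (!user) return;
    unsigned char* raw =
        static_cast<unsigned char*>(user) - kGuardSize - kHeaderSize;
    std::size_t size;
    std::memcpy(&size, raw, sizeof(size));
    std::memset(user, kPoisonByte, size);
    Banked b = { raw, size };
    g_bank.push_back(b);
}

// Returns the index of the first damaged banked block, or -1 if all are intact.
long CheckBank() {
    for (std::size_t i = 0; i < g_bank.size(); ++i) {
        const unsigned char* raw  = g_bank[i].raw;
        const unsigned char* user = raw + kHeaderSize + kGuardSize;
        const std::size_t size    = g_bank[i].size;
        if (!AllBytes(raw + kHeaderSize, kGuardSize, kGuardByte) ||
            !AllBytes(user, size, kPoisonByte) ||
            !AllBytes(user + size, kGuardSize, kGuardByte)) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Calling `CheckBank()` at several points in the program (for example at the start of every allocation and free) narrows down the window in which the stale write happens, which is the part the verifier stop alone does not tell you.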
It looks like you are dealing with heap corruption, and it is almost certain that the corruption happened some time before the actual crash with the call stack you posted. The Rtl...() functions aren't causing the corruption; they are just forcing it to be detected.
This MSDN message describes a similar issue to yours and a few ways to debug it. There is also this MS-KB article, which describes heap corruption in VC6. Both of these links (and a few others I've found) mention multi-threading, which is something to check if you are using it.
There is also the PageHeap application from MS, although it may do the same thing as Application Verifier.