System.Move memory corruption caused by a changed 8087CW mode (png + StretchBlt)

Posted 2024-08-28 00:42:01

I have a strange memory corruption problem. After many hours of debugging and experimenting, I think I have found something.

For example: I do a simple string assignment:

sTest := 'SET LOCK_TIMEOUT ';

However, the result sometimes becomes:

sTest = 'SET LOCK'#0'TIMEOUT '

So the _ gets replaced by a 0 byte.

I have seen this happen once (it is tricky to reproduce and depends on timing) in the System.Move function, when it uses the FPU stack (fild, fistp) for a fast memory copy (when 9 to 32 bytes are moved):

...
@@SmallMove: {9..32 Byte Move}
fild    qword ptr [eax+ecx] {Load Last 8}
fild    qword ptr [eax] {Load First 8}
cmp     ecx, 8
jle     @@Small16
fild    qword ptr [eax+8] {Load Second 8}
cmp     ecx, 16
jle     @@Small24
fild    qword ptr [eax+16] {Load Third 8}
fistp   qword ptr [edx+16] {Save Third 8}
...

Using the FPU view and 2 memory debug views (Delphi -> View -> Debug -> CPU -> Memory) I saw it going wrong... once... could not reproduce however...

This morning I read something about the 8087CW mode, and yes, if it is changed to $27F I get memory corruption! Normally it is $133F:

The difference between $133F and $027F is that $027F sets up the FPU for doing less precise calculations (limiting to Double instead of Extended) and different infinity handling (which was used for older FPUs, but is not used any more).

Okay, now I have found why, but not when!
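
To make the comparison between these values concrete, here is a minimal sketch that splits a control word into its standard x87 fields. The program name, the helper names (PrecisionField, RoundingField, DescribeCW) and the output layout are made up for illustration; only the bit positions come from the x87 control word layout.

```pascal
program DecodeCW;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

const
  PrecisionNames: array[0..3] of string =
    ('single (24-bit)', 'reserved', 'double (53-bit)', 'extended (64-bit)');
  RoundingNames: array[0..3] of string =
    ('to nearest', 'down', 'up', 'toward zero');

// Bits 8..9 of the control word: precision control.
function PrecisionField(CW: Word): Integer;
begin
  Result := (CW shr 8) and 3;
end;

// Bits 10..11 of the control word: rounding control.
function RoundingField(CW: Word): Integer;
begin
  Result := (CW shr 10) and 3;
end;

// Hypothetical helper: prints the fields of one control word.
procedure DescribeCW(CW: Word);
begin
  Writeln('$', IntToHex(CW, 4));
  Writeln('  precision : ', PrecisionNames[PrecisionField(CW)]);
  Writeln('  rounding  : ', RoundingNames[RoundingField(CW)]);
  Writeln('  infinity  : bit 12 = ', (CW shr 12) and 1);
  // Bits 0..5 mask invalid, denormal, zero-divide, overflow, underflow, precision.
  Writeln('  masks     : $', IntToHex(CW and $3F, 2));
end;

begin
  DescribeCW($133F);
  DescribeCW($027F);
  DescribeCW($1372);
end.
```

Run on the three values mentioned in this thread, it should report extended precision for $133F and $1372 and double precision for $027F, with the infinity bit set only in the first and last. $1372 differs from $133F only in which exceptions are masked.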

I changed the working of my AsmProfiler with a simple check (so all functions are checked at enter and leave):

if Get8087CW = $27F then    //normally $1372?
  if MainThreadID = GetCurrentThreadId then  //only check mainthread
    DebugBreak;

I "profiled" some units and DLLs and bingo (see stack):

Windows.StretchBlt(3372289943,0,0,514,345,4211154027,0,0,514,345,13369376)
pngimage.TPNGObject.DrawPartialTrans(4211154027,(0, 0, 514, 345, (0, 0), (514, 345)))
pngimage.TPNGObject.Draw($7FF62450,(0, 0, 514, 345, (0, 0), (514, 345)))
Graphics.TCanvas.StretchDraw((0, 0, 514, 345, (0, 0), (514, 345)),$7FECF3D0)
ExtCtrls.TImage.Paint
Controls.TGraphicControl.WMPaint((15, 4211154027, 0, 0))

So it is happening in StretchBlt...

What to do now? Is it a fault of Windows, or a bug in PNG (included in D2007)?
Or is the System.Move function not failsafe?

Note: simply trying to reproduce does not work:

  Set8087CW($27F);
  sSQL := 'SET LOCK_TIMEOUT ';

It seems to be more exotic... But with a DebugBreak on 'Get8087CW = $27F' I could reproduce it on another string (screenshots: FPU part 1, FPU part 2, FPU part 3, FPU Final: corrupt!).

Note 2: Maybe the FPU stack must be cleared in the System.Move?

Comments (4)

爱的十字路口 2024-09-04 00:42:01

I haven't seen this particular issue, but Move can definitely get messed up if the FPU is in a bad state. Cisco's VPN driver can screw things up horribly, even if you're not doing anything network related.

http://brianorr.blogspot.com/2006/11/intel-pentium-d-floating-point-unit.html [broken]

https://web.archive.org/web/20160601043520/http://www.dankohn.com/archives/343

http://blog.excastle.com/2007/08/28/delphi-bug-of-the-day-fpu-stack-leak/ (comments by Ritchie Annand)

In our case we detect the buggy VPN driver and swap out Move and FillChar with the Delphi 7 versions, replace IntToStr with a Pascal version (Int64-version uses the FPU), and, since we're using FastMM, we disable its custom fixed size move routines too, since they're even more susceptible than System.Move.
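
As a rough illustration of what a replacement move that stays away from the FPU can look like, here is a minimal sketch. It is a plain byte loop, not the Delphi 7 routine mentioned above, and PlainMove is a hypothetical name. How such a routine gets swapped in for System.Move is not shown.

```pascal
program PlainMoveDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical helper: byte-wise, overlap-safe copy written as plain loops,
// so it should not need the FPU registers at all.
procedure PlainMove(const Source; var Dest; Count: Integer);
var
  S, D: PAnsiChar;
  I: Integer;
begin
  S := PAnsiChar(@Source);
  D := PAnsiChar(@Dest);
  if (Count <= 0) or (S = D) then
    Exit;
  if D < S then
    for I := 0 to Count - 1 do      // destination below source: copy upward
      D[I] := S[I]
  else
    for I := Count - 1 downto 0 do  // destination above source: copy downward
      D[I] := S[I];
end;

var
  Buf: array[0..7] of AnsiChar;
  I: Integer;
begin
  for I := 0 to 7 do
    Buf[I] := AnsiChar(Ord('A') + I);
  PlainMove(Buf[0], Buf[2], 6);     // overlapping move, two bytes up
  for I := 0 to 7 do
    Write(Buf[I]);
  Writeln;
end.
```

The demo moves six bytes of 'ABCDEFGH' two positions up within the same buffer and should print ABABCDEF. A loop like this is much slower than the assembler versions, so it only makes sense as a fallback on affected machines.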

糖果控 2024-09-04 00:42:01

It might be a bug in your video driver that does not preserve the 8087 control word when it performs the StretchBlt operation.
In the past I have seen similar behaviour when using certain printer drivers. They think they own the 8087 CW and are wrong...
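
If a particular call is suspected of changing the control word, one defensive option is to save it before the call and restore it afterwards. The following is a minimal sketch under that assumption: TSimpleProc, CallPreservingCW and MisbehavingCall are hypothetical names, and MisbehavingCall only simulates a driver by calling Set8087CW itself. Get8087CW and Set8087CW are the System routines already used in the question.

```pascal
program PreserveCW;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  TSimpleProc = procedure;

// Hypothetical stand-in for a call that returns with a changed control word.
procedure MisbehavingCall;
begin
  Set8087CW($027F);
end;

// Runs Proc and puts the control word back, even if Proc raises.
procedure CallPreservingCW(Proc: TSimpleProc);
var
  SavedCW: Word;
begin
  SavedCW := Get8087CW;
  try
    Proc;
  finally
    Set8087CW(SavedCW);
  end;
end;

var
  Before: Word;
begin
  Before := Get8087CW;
  CallPreservingCW(MisbehavingCall);
  if Get8087CW = Before then
    Writeln('control word restored')
  else
    Writeln('control word still changed');
end.
```

In a real application the wrapped call would be the drawing code (for example the StretchDraw call from the stack trace in the question). This only restores the control word; it does nothing about values a driver might leave on the FPU register stack.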

Note that the default value of the 8087 CW in Delphi seems to be $1372; for a more detailed explanation of the CW values, see this article: it also explains a situation that Michael Justin described when his 8087CW got hosed.

--jeroen

旧夏天 2024-09-04 00:42:01

For those still interested in this: There's yet another possible cause of problems:

Delphi Rio still ships with a broken ASM version of Move.

I had the pleasure to run into that bug today, luckily enough I had a reproducible test case. The issue is this piece of code:

(* ***** BEGIN LICENSE BLOCK *****
 *
 * The assembly function Move is licensed under the CodeGear license terms.
 *
 * The initial developer of the original code is Fastcode
 *
 * Portions created by the initial developer are Copyright (C) 2002-2004
 * the initial developer. All Rights Reserved.
 *
 * Contributor(s): John O'Harrow
 *
 * ***** END LICENSE BLOCK ***** *)

// ... some less interesting parts omitted ...

@@LargeMove:
        JNG     @@LargeDone {Count < 0}
        CMP     EAX, EDX
        JA      @@LargeForwardMove

        // the following overlap test is broken
        // when size>uint(destaddr), EDX underflows to $FFxxxxxx, in which case 
        // we jump to @LargeForwardMove even if a backward loop would be appropriate
        // this will effectively shred everything at EDX + size
        SUB     EDX, ECX              // when this underflows ...
        CMP     EAX, EDX              // ... we also get CF=1 here (EDX is usually < $FFxxxxxx)
        LEA     EDX, [EDX+ECX]        // (does not affect flags)
        JNA     @@LargeForwardMove    // ... CF=1 so let's jump into disaster!

        SUB     ECX, 8 {Backward Move}
        PUSH    ECX
        FILD    QWORD PTR [EAX+ECX] {Last 8}
        FILD    QWORD PTR [EAX] {First 8}
        ADD     ECX, EDX
        AND     ECX, -8 {8-Byte Align Writes}
        SUB     ECX, EDX
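
The wrap-around described in the comments above can be reproduced with ordinary 32-bit unsigned arithmetic. This is a minimal sketch, not the RTL code: the function names and the three example values are made up, and Count is treated as unsigned for simplicity. The first function mirrors the two quoted comparisons; the second shows one possible alternative that subtracts in the other direction.

```pascal
program OverlapCheck;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$Q-,R-}  // let Dst - Count wrap around, like the CPU's SUB

// Mirrors the quoted test: forward if Src > Dst, or if Src <= Dst - Count.
function QuotedTestPicksForward(Src, Dst, Count: Cardinal): Boolean;
begin
  Result := (Src > Dst) or (Src <= Cardinal(Dst - Count));
end;

// Alternative test: once Src <= Dst is known, Dst - Src cannot wrap.
function ForwardIsSafe(Src, Dst, Count: Cardinal): Boolean;
begin
  Result := (Src > Dst) or (Cardinal(Dst - Src) >= Count);
end;

const
  // Made-up values: the destination address is smaller than the byte count,
  // and the ranges overlap because Src + Count > Dst.
  Src = $00001000;
  Dst = $00002000;
  Count = $00003000;

begin
  Writeln('quoted test picks forward: ', QuotedTestPicksForward(Src, Dst, Count));
  Writeln('forward copy is safe     : ', ForwardIsSafe(Src, Dst, Count));
end.
```

With these values Dst - Count wraps to $FFFFF000, so the quoted test reports TRUE and picks the forward path, while the second function reports FALSE because the destination starts inside the source range.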


尝蛊 2024-09-04 00:42:01

Just for your information (in case someone else has the same problem): we did an upgrade of our software for a customer, and the complete touchscreen locked up when our application was started! Windows was completely frozen! The PC had to be restarted (power off). It took some time to figure out the cause of the complete freeze.

Fortunately we had one (only 1!) stack trace of an AV in FastMove.LargeSSEMove. I disabled the usage of SSE in FastMove, and the problem is gone.

By the way: the touchscreen has a VIA Nehemiah CPU with an S3 chipset.

So not only can you get memory corruption when using the FPU, but also a complete freeze!
