Why is there no C4738-like warning for double?

Posted 2024-12-29 06:42:02

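Visual C++ can emit the C4738 warning:

storing 32-bit float result in memory, possible loss of performance

It is issued when a 32-bit float is about to be stored in memory instead of being kept in a register, which may cost performance.

The description further says that using double resolves the issue. I don't understand why the latter is true.

Why does storing a float in memory cause a performance loss while storing a double does not?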


你是我的挚爱i 2025-01-05 06:42:02

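The warning combines two issues:

Floats need to be stored in memory instead of registers, which can reduce performance (because memory is much slower than registers).
Floats will be rounded (because registers always have 64 or 80 bits, but in memory a float has only 32 bits).

Using double solves the second issue (at least partially; 64 bits are still less precise than 80 bits), but it has no effect on the possible performance loss. That is why the warning description mentions two remedies:

To resolve this warning and avoid rounding, compile with /fp:fast or use doubles instead of floats.
To resolve this warning and avoid running out of registers, change the order of computation and modify your use of inlining.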


疧_╮線 2025-01-05 06:42:02


While I'm not 100% sure of the cause, here's my guess.

When compiling for x86 without SSE2 enabled, the compiler must use the x87 FP stack for all floating-point operations. On MSVC, the FP mode is, by default, set to 53-bit precision rounding. (I think. I'm not 100% sure on this.)

Therefore, all operations done on the FP stack are performed at double precision.

However, when something is cast down to a float, the precision needs to be rounded to single precision. The only way to do this is to store it to memory via the fstp instruction with a 4-byte memory operand, and then reload it.

Let's look at the example on the C4738 warning page you linked to:

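#include <stdio.h>

float func(float f)
{
    return f;
}

int main()
{
    // f, f1 and f2 are only declared here; they must be defined in
    // another translation unit for the program to link.
    extern float f, f1, f2;
    double d = 0.0;

    f1 = func(d);      // d is narrowed to float for the call
    f2 = (float) d;    // explicit narrowing to float
    f = f1 + f2;   // C4738
    printf_s("%f\n", f);
}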

When you call func(), d is probably held in an x87 register. However, the call to func() requires the value to be lowered to single precision, so d is rounded and stored to memory. It is then reloaded and promoted back to double precision on the line f = f1 + f2;.

However, if you use double the whole way, the compiler can keep d in a register, thus bypassing the overhead of going to and from memory.

As for why it could make you run out of registers... I have no idea. It's possible that the semantics of the program require keeping both a double-precision and a single-precision copy of the same value, which would take an extra register.
