Detecting floating-point software emulation

Posted 2024-11-06 04:40:20

I'm working on an application where runtime speed matters more than precision. The number crunching involves floating-point arithmetic, and I'm concerned that double and/or long double might be handled in software instead of natively on the processor (this is always true on a 32-bit arch, right?). I would like to conditionally compile using the highest precision that has hardware support, but I haven't found a quick and easy way to detect software emulation. I'm using g++ on GNU/Linux and I'm not concerned about portability. The code runs on x86, so I'm assuming that float is always native.

Comments (4)

記憶穿過時間隧道 2024-11-13 04:40:20

The floating-point unit (FPU) on modern x86 is natively double (in fact, it's even wider than double), not float. The "32" in 32-bit describes the integer register width, not the floating-point width. This does not hold, however, if your code takes advantage of vectorized SSE instructions, which perform either 4 single-precision or 2 double-precision operations in parallel.

If it doesn't, then the main speed hit from switching your app from float to double will be the increased memory bandwidth.
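To make the "4 single or 2 double" figure and the bandwidth point concrete, here is a minimal sketch. The names `kSseRegisterBytes`, `lanes_per_sse_register` and `array_bytes` are made up for illustration; the only assumptions are a 128-bit SSE register and the usual 4-byte float / 8-byte double.

```cpp
#include <cstddef>

// Width of one SSE (XMM) register in bytes: 128 bits.
constexpr std::size_t kSseRegisterBytes = 16;

// How many values of type T fit side by side in one 128-bit SSE register.
template <typename T>
constexpr std::size_t lanes_per_sse_register() {
    return kSseRegisterBytes / sizeof(T);
}

// Bytes that have to travel through memory for an array of n values of type T.
template <typename T>
constexpr std::size_t array_bytes(std::size_t n) {
    return n * sizeof(T);
}

// These hold wherever float is 4 bytes and double is 8 bytes, as on x86.
static_assert(lanes_per_sse_register<float>() == 4, "4 floats per XMM register");
static_assert(lanes_per_sse_register<double>() == 2, "2 doubles per XMM register");
```

So a vectorized loop gets half as many lanes per instruction with double, and any array takes twice the bytes, which is where the memory bandwidth cost comes from.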

自找没趣 2024-11-13 04:40:20

(this is always true on a 32-bit arch right?)

No. Common CPUs have dedicated hardware for double (and in some cases for long double as well). And honestly, if performance is a concern, then you should know your CPU. Hit the CPU manuals and figure out what the performance penalty for each data type is.

Even on CPUs that lack "proper" double support, it still isn't emulated in software. The Cell CPU (of PlayStation 3 fame) simply passes a double through the FPU twice, so it's a lot costlier than a float computation, but it's not software emulation. You still have dedicated instructions for double processing. They're just less efficient than the equivalent float instructions.

Unless you are targeting either 20-year-old CPUs or small, limited embedded processors, floating-point instructions will be handled in hardware, although not all architectures handle every data type equally efficiently.
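Besides reading the manuals, you can simply measure the per-type cost on the machine you care about. The following is a rough sketch, not a rigorous benchmark: `time_multiply_add` is a hypothetical helper name, the constants are arbitrary, and the numbers you get will depend on compiler flags and on whatever else the machine is doing.

```cpp
#include <chrono>
#include <cstddef>

// Runs `iterations` dependent multiply-add steps on type T and returns the
// elapsed wall-clock time in seconds. The final value is written to `sink`
// (a volatile) so the compiler cannot throw the loop away as dead code.
template <typename T>
double time_multiply_add(std::size_t iterations, volatile T& sink) {
    T acc = T(1);
    const T scale = T(1.0000001);
    const T offset = T(1e-7);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        acc = acc * scale + offset;  // each step depends on the previous one
    }
    const auto stop = std::chrono::steady_clock::now();
    sink = acc;
    return std::chrono::duration<double>(stop - start).count();
}
```

Call it once each with a `volatile float`, `volatile double` and `volatile long double` sink and the same iteration count, then compare the returned times. If one type really were emulated in software, you would expect it to be slower by a large factor rather than by a few percent.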

我是有多爱你 2024-11-13 04:40:20

x86 does float, double, and more in hardware, and has done so for a long time. Many modern 32-bit programs assume SSE2 support, since it has been around for several years now and can be depended on to be present on a consumer chip.
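Since you are on g++, one way to see what the compiler was told it may use is to look at GCC's predefined macros. Below is a small sketch; `fp_target_summary` is a made-up name, while `__SSE__`, `__SSE2__`, `__SSE_MATH__` and `__SSE2_MATH__` are the macros GCC defines on x86 depending on `-msse`, `-msse2` and `-mfpmath`. Note that this reflects the compile-time target flags, not the CPU the binary ends up running on.

```cpp
#include <string>

// Summarizes which floating-point instruction sets the compiler was allowed
// to use for this translation unit, based on GCC's predefined x86 macros.
inline std::string fp_target_summary() {
    std::string s;
#if defined(__SSE2__)
    s += "SSE2 enabled; ";
#elif defined(__SSE__)
    s += "SSE enabled (single precision only); ";
#else
    s += "no SSE enabled; ";
#endif
#if defined(__SSE2_MATH__)
    s += "scalar float and double math use SSE";
#elif defined(__SSE_MATH__)
    s += "scalar float math uses SSE, double falls back to x87";
#else
    s += "scalar math is not using SSE";
#endif
    return s;
}
```

The same `#if defined(__SSE2__)` test can guard a typedef if you want to pick a type at compile time. To list the macros for a given set of flags, `g++ -dM -E -x c++ /dev/null | grep SSE` prints whatever is predefined.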

假面具 2024-11-13 04:40:20

On x86, the hardware typically uses 80 bits internally, which is more than enough for double.

Are you sure that performance is a real concern (based on profiling the code), or are you just guessing that it may not be supported?
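If you want to check the 80-bit claim on your own toolchain, the standard headers already expose the precision of each type. This is only a sketch with illustrative names (`kFloatBits`, `kDoubleBits`, `kLongDoubleBits`, `kLongDoubleIsWider`, `evaluation_method`); the values in the comments are what I would expect from g++ on x86 GNU/Linux, not something the standard guarantees.

```cpp
#include <cfloat>
#include <limits>

// Significand precision of each type, in bits.
constexpr int kFloatBits = std::numeric_limits<float>::digits;             // 24 for IEEE single
constexpr int kDoubleBits = std::numeric_limits<double>::digits;           // 53 for IEEE double
constexpr int kLongDoubleBits = std::numeric_limits<long double>::digits;  // 64 for the x87 80-bit format

// True when long double actually carries more precision than double.
constexpr bool kLongDoubleIsWider = kLongDoubleBits > kDoubleBits;

// FLT_EVAL_METHOD from <cfloat>: 0 means expressions are evaluated in their
// own type, 2 means intermediates are kept in long double (typical for x87
// math), and -1 means it cannot be determined.
inline int evaluation_method() { return FLT_EVAL_METHOD; }
```

A 64-bit significand for long double corresponds to the x87 extended format mentioned above. None of this tells you anything about speed, which is why profiling is still the real answer.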
