Which math methods are implemented in FPU hardware via .NET?
Does anyone know which math methods are implemented by the processor's hardware for .NET? For example, I have an algorithm that makes heavy use of atan. I could easily write a lookup table for this, but if math.net implements it using an FPU or other hardware extensions, it won't be worth it.
Comments (4)
Why not just benchmark your lookup-table approach against the atan() provided with .NET? Then you'll be able to tell clearly how much of a speed difference the lookup table really makes.
Armed with that, you won't need to know how the underlying VM does things in order to identify the fastest method. You'll even be able to quantify the speedup.
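A minimal sketch of such a benchmark, assuming inputs restricted to [0, 1] and a nearest-entry table. The names `AtanBenchmark`, `TableAtan`, and `TableSize`, the table size, and the iteration count are all hypothetical choices for illustration, not anything from the question:

```csharp
using System;
using System.Diagnostics;

static class AtanBenchmark
{
    // Hypothetical table: atan sampled uniformly on [0, 1]; the size is arbitrary.
    const int TableSize = 4096;
    static readonly double[] Table = BuildTable();

    static double[] BuildTable()
    {
        var t = new double[TableSize + 1];
        for (int i = 0; i <= TableSize; i++)
            t[i] = Math.Atan((double)i / TableSize);
        return t;
    }

    // Nearest-entry lookup for x in [0, 1]. No interpolation, so the
    // accuracy is limited by the table step (1 / TableSize).
    public static double TableAtan(double x)
    {
        return Table[(int)(x * TableSize + 0.5)];
    }

    static void Main()
    {
        const int N = 50000000;
        double sum = 0; // accumulate results so the calls are not optimized away

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
            sum += Math.Atan((i & 1023) / 1023.0);
        sw.Stop();
        Console.WriteLine("Math.Atan: {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < N; i++)
            sum += TableAtan((i & 1023) / 1023.0);
        sw.Stop();
        Console.WriteLine("Table:     {0} ms", sw.ElapsedMilliseconds);

        Console.WriteLine(sum); // keep the accumulated result observable
    }
}
```

Timings from a loop like this depend heavily on build configuration, JIT warm-up, and cache behaviour, so it is probably worth running a release build several times and feeding it inputs that resemble the real algorithm's.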
Whether or not the methods are implemented using the x87 hardware instructions isn't the issue, because the hardware transcendental function instructions are slow.
The "Intel 64 and IA-32 Architectures Optimization Reference Manual" lists fpatan as having a latency of 150-300 cycles on recent hardware. A well-written software implementation can deliver a full-accuracy double-precision result in substantially less time; indeed, high-quality math libraries do just that.
According to this blog, Microsoft's JIT compiler does take advantage of FPU instructions on the x86 platform:
http://blogs.msdn.com/davidnotario/archive/2004/10/26/247792.aspx
It's a pretty elementary thing to do, since FPUs have been standard on x86 CPUs for over a decade now.
With a small benchmark application, it appears that even if the FPU is used, Math.Atan2 does not perform as well as an alternative approximation function.
In my simple benchmark, the Math.Atan2 loop takes 8 seconds, whilst the approximate version takes 5.5 seconds.
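The approximation used in that benchmark is not shown, so the following is only an illustrative sketch of what such a function might look like. It assumes the common rational approximation atan(r) ≈ r / (1 + 0.28 r²) for |r| ≤ 1, which is accurate to roughly 0.005 rad; the class name `FastAtan` is hypothetical:

```csharp
using System;

static class FastAtan
{
    // Approximate atan2 built on atan(r) ~ r / (1 + 0.28 * r * r), |r| <= 1.
    // This trades accuracy (error on the order of 0.005 rad) for speed; it
    // is NOT the function used by the poster, whose code is not shown.
    public static double Atan2(double y, double x)
    {
        if (x == 0.0 && y == 0.0) return 0.0;

        double ax = Math.Abs(x), ay = Math.Abs(y);
        double a;
        if (ax >= ay)
        {
            double r = ay / ax;                      // r in [0, 1]
            a = r / (1.0 + 0.28 * r * r);
        }
        else
        {
            double r = ax / ay;                      // r in [0, 1)
            a = Math.PI / 2 - r / (1.0 + 0.28 * r * r);
        }

        if (x < 0) a = Math.PI - a;                  // reflect into quadrants II/III
        return y < 0 ? -a : a;                       // mirror for negative y
    }

    static void Main()
    {
        // Measure the worst-case error against Math.Atan2 around the circle.
        double maxErr = 0;
        for (int i = 0; i < 3600; i++)
        {
            double t = i * Math.PI / 1800.0;
            double y = Math.Sin(t), x = Math.Cos(t);
            // Skip samples next to the +/-pi branch cut, where a tiny sign
            // difference in y flips the reference result by 2*pi.
            if (x < 0 && Math.Abs(y) < 1e-9) continue;
            maxErr = Math.Max(maxErr, Math.Abs(Atan2(y, x) - Math.Atan2(y, x)));
        }
        Console.WriteLine("max abs error: {0:E2} rad", maxErr);
    }
}
```

Whether an error of this size is acceptable depends entirely on the algorithm consuming the result, which is worth checking before trading Math.Atan2 for any approximation.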