High-precision math on the GPU

Posted 2024-08-05 14:36:25

I'm interested in implementing an algorithm on the GPU using HLSL, but one of my main concerns is that I would like a variable level of precision. Are there techniques for emulating 64-bit and higher precision that could be implemented on the GPU?

Thanks!

Comments (4)

拥有 2024-08-12 14:36:25

GPUs are just beginning to support double precision in hardware, though it will continue to be much slower than single precision for the near future. A wide variety of techniques have been developed over the years to synthesize higher-accuracy floating point from a representation composed of multiple floats in whatever precision has fast hardware support, but the overhead is pretty substantial. IIRC, the crlibm manual has a pretty good discussion of some of these techniques, with error analysis and pseudocode (crlibm uses them to represent numbers as more than one double-precision value, but the same techniques can be used with single precision).
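
To make the "multiple floats" idea concrete, here is a minimal CPU-side sketch of the usual building block, an error-free addition (Knuth's TwoSum). It is not taken from the crlibm manual. The helper names `f32` and `two_sum` are made up for illustration, and single precision is emulated by rounding Python floats through `struct`, which is an assumption of this sketch and not how a shader would do it.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Knuth's TwoSum in emulated single precision.

    Returns (s, err) where s is the rounded sum and s + err equals
    a + b exactly, so the rounding error is captured, not lost.
    """
    s = f32(a + b)
    b_virtual = f32(s - a)          # the part of b that made it into s
    a_virtual = f32(s - b_virtual)  # the part of a that made it into s
    err = f32(f32(a - a_virtual) + f32(b - b_virtual))
    return s, err

big, small = 1.0, f32(1e-8)
s, err = two_sum(big, small)
print(s, err)  # s is just 1.0, but err still carries the small addend
```

On a GPU the same sequence is a handful of float adds and subtracts, provided the compiler is not allowed to reassociate or simplify them away.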

Without knowing more about what you're trying to do, it's hard to give a better answer. For some algorithms, only one small part of the computation needs high accuracy; if you're in a case like that, it might be possible for you to get decent performance on the GPU, though the code won't necessarily be very pretty or easy to work with. If you need high precision pervasively throughout your algorithm, then the GPU probably isn't an attractive option for you at the moment.
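
As an illustration of the "only one small part needs high accuracy" case, here is a hedged sketch where the only precision-sensitive step is a long accumulation. Everything stays in (emulated) single precision, and only the accumulator carries a compensation term (Kahan summation). The names `f32`, `naive_sum32` and `kahan_sum32` are hypothetical, and the `struct`-based rounding is a stand-in for real float hardware.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum32(values):
    """Plain single-precision accumulation."""
    total = 0.0
    for v in values:
        total = f32(total + v)
    return total

def kahan_sum32(values):
    """Single-precision accumulation with a Kahan compensation term."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = f32(v - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total

tenth = f32(0.1)
values = [tenth] * 100_000
exact = 100_000 * tenth  # reference value, computed in double precision

print("naive error:", abs(naive_sum32(values) - exact))
print("kahan error:", abs(kahan_sum32(values) - exact))
```

The compensated loop costs four float operations per element where the naive loop costs one, which is the kind of localized overhead that can still be acceptable on a GPU.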

Finally, why HLSL and not a compute-oriented language like CUDA or OpenCL?

蹲墙角沉默 2024-08-12 14:36:25

Using two floats (i.e. single-precision values), you can achieve about 48 bits of precision, since each float contributes a 24-bit significand. This approaches the 53-bit precision of a double, but many of the operations you can implement for this "double single" data type are slow and less precise than using doubles. However, for simple arithmetic operations, they are usually sufficient.
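
For a feel of what a "double single" addition looks like, here is a rough sketch of the simple (sometimes called "sloppy") variant, with a number stored as a `(hi, lo)` pair of single-precision values. This is not DSFUN90's routine. The names `f32`, `two_sum`, `quick_two_sum` and `ds_add` are made up here, and single precision is emulated on the CPU via `struct`.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Error-free addition: returns (s, err) with s + err == a + b."""
    s = f32(a + b)
    bv = f32(s - a)
    err = f32(f32(a - f32(s - bv)) + f32(b - bv))
    return s, err

def quick_two_sum(a, b):
    """Same as two_sum but cheaper; only valid when |a| >= |b|."""
    s = f32(a + b)
    err = f32(b - f32(s - a))
    return s, err

def ds_add(x, y):
    """Add two double-single numbers, each a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])       # exact sum of the high words
    e = f32(e + f32(x[1] + y[1]))    # fold in the low words
    return quick_two_sum(s, e)       # renormalize into (hi, lo)

one = (1.0, 0.0)
tiny = (2.0 ** -30, 0.0)
print(f32(one[0] + tiny[0]))  # a single float cannot hold 1 + 2**-30
print(ds_add(one, tiny))      # the pair keeps the small term in its low word
```

Each double-single add here costs roughly a dozen single-precision operations, which is where the slowdown mentioned above comes from.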

This paper talks a bit about the idea and describes how to implement the multiplication operation. For a more complete list of operations you can perform and how to implement them, check out the DSFUN90 package here. The package is written in Fortran 90, but can be translated to anything that has single-precision numbers. Be aware, though, that you must license the library from them to use it for commercial purposes. I believe the Mersenne-Twister CUDA demo application also has implementations of the addition and multiplication operations.
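
For reference, a common way to build the multiplication when no fused multiply-add is available is Dekker's product with Veltkamp splitting. The sketch below follows that textbook construction and is not copied from the paper or from DSFUN90. The names `f32`, `split`, `two_prod`, `quick_two_sum` and `ds_mul` are hypothetical, and single precision is emulated through `struct`.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

SPLITTER = 4097.0  # 2**12 + 1, splits a 24-bit significand into two halves

def split(a):
    """Veltkamp split: a == hi + lo, each half short enough to multiply exactly."""
    t = f32(SPLITTER * a)
    hi = f32(t - f32(t - a))
    lo = f32(a - hi)
    return hi, lo

def two_prod(a, b):
    """Dekker's product: returns (p, err) with p + err == a * b exactly."""
    p = f32(a * b)
    ah, al = split(a)
    bh, bl = split(b)
    err = f32(f32(ah * bh) - p)
    err = f32(err + f32(ah * bl))
    err = f32(err + f32(al * bh))
    err = f32(err + f32(al * bl))
    return p, err

def quick_two_sum(a, b):
    """Error-free addition, only valid when |a| >= |b|."""
    s = f32(a + b)
    return s, f32(b - f32(s - a))

def ds_mul(x, y):
    """Multiply two double-single numbers, each a (hi, lo) pair of floats."""
    p, e = two_prod(x[0], y[0])
    e = f32(e + f32(f32(x[0] * y[1]) + f32(x[1] * y[0])))  # cross terms
    return quick_two_sum(p, e)

a = 1.0 + 2.0 ** -20
print(f32(a * a))                 # single precision drops the 2**-40 term
print(ds_mul((a, 0.0), (a, 0.0))) # the pair keeps it in the low word
```

The `lo * lo` cross term is dropped in `ds_mul` because it falls below the precision the pair can hold. Overflow and underflow are not handled in this sketch.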

晒暮凉 2024-08-12 14:36:25

This is a slightly off-topic answer, but if you want to see how your problem is going to be impacted by switching some operations to single-precision arithmetic, you should think about using interval arithmetic to empirically measure the uncertainty boundaries when you mix precision in various ways. Boost has an interval arithmetic library that I once used to instrument an existing C++ scientific code: it was quite easy to use.

But be warned: interval arithmetic is notoriously pessimistic, i.e. it sometimes exaggerates the bounds. Affine arithmetic is supposed to be better, but I never found a usable library for it.
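
To show what "pessimistic" means in practice, here is a toy interval type. It is not the Boost API, just a minimal sketch. The class `Interval` and the helpers `_down` and `_up` are invented for illustration, and it assumes Python 3.9+ for `math.nextafter`, which is used to round each bound outward by one step.

```python
import math
from dataclasses import dataclass

def _down(x):
    """Nudge a lower bound one representable step toward -infinity."""
    return math.nextafter(x, -math.inf)

def _up(x):
    """Nudge an upper bound one representable step toward +infinity."""
    return math.nextafter(x, math.inf)

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(_down(self.lo + other.lo), _up(self.hi + other.hi))

    def __sub__(self, other):
        return Interval(_down(self.lo - other.hi), _up(self.hi - other.lo))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(_down(min(products)), _up(max(products)))

    def width(self):
        return self.hi - self.lo

x = Interval(1.0, 2.0)
print(x - x)  # mathematically 0, but the interval result spans about [-1, 1]
```

The subtraction has no way of knowing that both operands are the same quantity, so it assumes the worst case for each. Long computations that reuse variables widen the bounds in the same way, which is the exaggeration mentioned above.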

顾冷 2024-08-12 14:36:25

ATI's Stream SDK supports some native double precision, but it's not HLSL.

The catches are that:

  • not all GPUs have double precision hardware, only the higher-end cards like HD 4870
  • not all double precision operations are available. For example, no divide instruction.

OpenCL will support double precision as an extension, but that's still in beta.
