Hardware implementation of square root?
I'm trying to find a bit more information on efficient square root algorithms of the kind most likely to be implemented on an FPGA. I have already found a lot of algorithms, but which ones are used, for example, by Intel or AMD?
By efficient I mean that they are either really fast or don't need much memory.
EDIT: I should probably mention that the input is generally a floating-point number, since most hardware implements the IEEE 754 standard, where a single-precision number is represented as 1 sign bit, 8 bits of biased exponent and 23 bits of mantissa.
Thanks!
Comments (2)
Not a full solution, but a couple of pointers.
I assume you're working in floating point, so point 1 is to remember that a floating-point number is stored as a mantissa and an exponent. Since sqrt(m · 2^e) = sqrt(m) · 2^(e/2), the exponent of the square root is half the exponent of the original number; when the exponent is odd, fold a factor of 2 into the mantissa first so that the halving is exact.
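The exponent-halving step can be sketched as a small software model. This is only an illustration, assuming IEEE 754 single precision and positive normal inputs; the function names `split_for_sqrt` and `sqrt_via_split` are made up for this example, and `math.sqrt` stands in for whatever mantissa approximation the hardware would use.

```python
import math
import struct

def split_for_sqrt(x: float):
    """Decompose a positive normal float32 value as m * 2**e with e even and m in [1, 4)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    biased_exp = (bits >> 23) & 0xFF          # 8-bit biased exponent
    frac = bits & 0x7FFFFF                    # 23-bit stored fraction
    if (bits >> 31) or biased_exp in (0, 0xFF):
        raise ValueError("sketch handles positive normal numbers only")
    e = biased_exp - 127                      # remove the bias
    m = 1.0 + frac / float(1 << 23)           # implicit leading 1, so m is in [1, 2)
    if e & 1:                                 # odd exponent: move a factor of 2 into m
        m *= 2.0
        e -= 1
    return m, e

def sqrt_via_split(x: float) -> float:
    """sqrt(m * 2**e) = sqrt(m) * 2**(e/2); math.sqrt is a placeholder for the mantissa path."""
    m, e = split_for_sqrt(x)
    return math.sqrt(m) * 2.0 ** (e // 2)
```

After the split, the only real work left is a square root on the narrow range [1, 4), which is what makes a small table practical.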
Then the mantissa can be approximated with a look-up table, and a couple of Newton-Raphson rounds can refine the LUT result to the accuracy you need.
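A rough model of the table-plus-refinement idea is below. Several things here are assumptions rather than anything stated above: the mantissa `m` is taken to lie in [1, 4) (that is, after an odd exponent has been made even), the table has 64 entries, and the table stores a seed for 1/sqrt(m) so that the Newton-Raphson step needs only multiplies. The names `RSQRT_LUT` and `sqrt_mantissa` are hypothetical.

```python
import math

LUT_BITS = 6                                   # assumed table size: 64 entries
LUT_SIZE = 1 << LUT_BITS
# Entry i covers m in [1 + 3*i/LUT_SIZE, 1 + 3*(i+1)/LUT_SIZE); it holds 1/sqrt at the midpoint.
RSQRT_LUT = [1.0 / math.sqrt(1.0 + 3.0 * (i + 0.5) / LUT_SIZE) for i in range(LUT_SIZE)]

def sqrt_mantissa(m: float, rounds: int = 2) -> float:
    """Approximate sqrt(m) for m in [1, 4) from a table seed plus Newton-Raphson rounds."""
    if not 1.0 <= m < 4.0:
        raise ValueError("m must lie in [1, 4)")
    # A real design would more likely index with raw exponent/fraction bits;
    # the arithmetic form is used here only to keep the model easy to follow.
    index = min(int((m - 1.0) * LUT_SIZE / 3.0), LUT_SIZE - 1)
    r = RSQRT_LUT[index]                       # seed for 1/sqrt(m)
    for _ in range(rounds):
        r = r * (1.5 - 0.5 * m * r * r)        # Newton-Raphson step for f(r) = 1/r**2 - m
    return m * r                               # sqrt(m) = m * (1/sqrt(m))
```

Each round roughly squares the relative error of the seed, so a larger table can be traded against fewer rounds, which is the speed-versus-memory choice raised in the question.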
I haven't implemented anything like this for about 8 years, but I think this is how I did it and was able to get a result in 3 or 4 cycles.
This is a great one for fast inverse square root.
Have a look at it here. Note that it's mostly about the initial guess; a rather amazing document :)
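The linked document is not reproduced here, so as an assumption the sketch below uses the widely circulated bit trick with the constant `0x5F3759DF`, which may or may not be the exact variant that document analyses. It shows how an integer operation on the float's bit pattern yields the initial guess, followed by Newton-Raphson refinement. The function name `fast_rsqrt` is made up, and Python performs the refinement in double precision, unlike a float32 datapath.

```python
import struct

def fast_rsqrt(x: float, rounds: int = 1) -> float:
    """Approximate 1/sqrt(x) for a positive normal float32 value x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]   # reinterpret float32 as uint32
    bits = 0x5F3759DF - (bits >> 1)                       # initial guess via integer arithmetic
    y = struct.unpack("<f", struct.pack("<I", bits))[0]   # reinterpret back as float32
    for _ in range(rounds):
        y = y * (1.5 - 0.5 * x * y * y)                   # Newton-Raphson step for 1/sqrt(x)
    return y
```

Multiplying the result by `x` gives an approximation of sqrt(x), since sqrt(x) = x · (1/sqrt(x)).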