C# 中是否有一个好的浮点基数排序实现
我有一个带有浮点类型字段的数据结构。这些结构体的集合需要按浮点值排序。是否有一个基数排序实现。
如果没有,是否有一种快速的方法来访问指数、符号和尾数。 因为如果你首先对尾数、指数和最后一次的指数对浮点数进行排序。您可以在 O(n) 时间内对浮点数进行排序。
I have a datastructure with a field of the float-type. A collection of these structures needs to be sorted by the value of the float. Is there a radix-sort implementation for this.
If there isn't, is there a fast way to access the exponent, the sign and the mantissa.
Because if you sort the floats first on mantissa, exponent, and on exponent the last time. You sort floats in O(n).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
更新:
我对这个主题非常感兴趣,所以我坐下来实现了它(使用 这种非常快速且节省内存的实现)。我还阅读了这个(感谢celion),发现您甚至不必将浮点数拆分为尾数和指数来对其进行排序。您只需一对一地获取这些位并执行 int 排序。你只需要关心负值,它们必须在算法结束时相反地放在正值前面(我在算法的最后一次迭代中一步完成了这一点,以节省一些 CPU 时间)。
这是我的浮点基数排序:
它比 int 基数排序稍慢,因为在函数的开头和结尾进行数组复制,其中浮点数按位复制到整数并返回。尽管如此,整个函数仍然是 O(n)。无论如何,比像你建议的那样连续排序 3 次要快得多。我看不到太多优化空间,但如果有人这样做:请随时告诉我。
要按降序排序,请在最后更改此行:
对此:
测量:
我设置了一些简短的测试,包含浮点数的所有特殊情况(NaN、+/-Inf、最小/最大值、0 )和随机数。它的排序顺序与 Linq 或 Array.Sort 对浮点进行排序的顺序完全相同:
所以我用 10M 数字的巨大数组进行了测试:
并停止了不同排序算法的时间:
输出是( 更新:现在使用发布版本运行,而不是调试):
大约比 Linq 快四倍多。那还不错。但仍然没有 Array.Sort 那么快,但也没有差多少。但我对这个感到非常惊讶:我预计它在非常小的数组上会比 Linq 稍微慢一些。但后来我运行了一个仅包含 20 个元素的测试:
即使这次我的 Radixsort 比 Linq 更快,但比数组排序慢方式。 :)
更新 2:
我做了一些更多的测量,发现了一些有趣的事情:较长的组长度常量意味着更少的迭代和更多的内存使用。如果您使用 16 位的组长度(仅 2 次迭代),则在对小数组进行排序时会产生巨大的内存开销,但如果涉及大于大约 100k 元素的数组,则可以击败 Array.Sort 。 ,即使不是很多。图表轴都是对数的:
(来源:daubmeier.de)
Update:
I was quite interested in this topic, so I sat down and implemented it (using this very fast and memory conservative implementation). I also read this one (thanks celion) and found out that you even dont have to split the floats into mantissa and exponent to sort it. You just have to take the bits one-to-one and perform an int sort. You just have to care about the negative values, that have to be inversely put in front of the positive ones at the end of the algorithm (I made that in one step with the last iteration of the algorithm to save some cpu time).
So heres my float radixsort:
It is slightly slower than an int radix sort, because of the array copying at the beginning and end of the function, where the floats are bitwise copied to ints and back. The whole function nevertheless is again O(n). In any case much faster than sorting 3 times in a row like you proposed. I dont see much room for optimizations anymore, but if anyone does: feel free to tell me.
To sort descending change this line at the very end:
to this:
Measuring:
I set up some short test, containing all special cases of floats (NaN, +/-Inf, Min/Max value, 0) and random numbers. It sorts exactly the same order as Linq or
Array.Sort
sorts floats:So i ran a test with a huge array of 10M numbers:
And stopped the time of the different sorting algorithms:
And the output was (update: now ran with release build, not debug):
roughly more than four times as fast as Linq. That is not bad. But still not yet that fast as
Array.Sort
, but also not that much worse. But i was really surprised by this one: I expected it to be slightly slower than Linq on very small arrays. But then I ran a test with just 20 elements:and even this time my Radixsort is quicker than Linq, but way slower than array sort. :)
Update 2:
I made some more measurements and found out some interesting things: longer group length constants mean less iterations and more memory usage. If you use a group length of 16 bits (only 2 iterations), you have a huge memory overhead when sorting small arrays, but you can beat
Array.Sort
if it comes to arrays larger than about 100k elements, even if not very much. The charts axes are both logarithmized:(source: daubmeier.de)
这里有一个关于如何对浮点数执行基数排序的很好的解释:
http://www.codercorner.com/RadixSortRevisited.htm
如果您的所有值都是正数,你可以不用使用二进制表示法;该链接解释了如何处理负值。
There's a nice explanation of how to perform radix sort on floats here:
http://www.codercorner.com/RadixSortRevisited.htm
If all your values are positive, you can get away with using the binary representation; the link explains how to handle negative values.
通过进行一些奇特的转换和交换数组而不是复制,此版本对于 10M 数字的速度比 Philip Daubmeiers 原始版本(组长度设置为 8)快 2 倍。对于该数组大小,它比 Array.Sort 快 3 倍。
By doing some fancy casting and swapping arrays instead of copying this version is 2x faster for 10M numbers as Philip Daubmeiers original with grouplength set to 8. It is 3x faster as Array.Sort for that arraysize.
您可以使用
unsafe
块进行 memcpy 或将float *
别名为uint *
来提取位。You can use an
unsafe
block to memcpy or alias afloat *
to auint *
to extract the bits.我认为如果这些值不太接近并且有合理的精度要求,您最好的选择是,您可以仅使用小数点前后的实际浮点数字来进行排序。
例如,您可以只使用前 4 位小数(无论是否为 0)进行排序。
I think your best bet if the values aren't too close and there's a reasonable precision requirement, you can just use the actual float digits before and after the decimal point to do the sorting.
For example, you can just use the first 4 decimals (be they 0 or not) to do the sorting.