将双精度型转换为整数以提高速度

发布于 2024-09-01 19:34:09 字数 1060 浏览 4 评论 0原文

在 Redis (http://code.google.com/p/redis) 中,有分数关联到元素,以便对该元素进行排序。即使许多用户实际上按整数排序(例如 unix 时间),该分数也是双精度的。

当数据库保存时我们需要将这个双打ok写入磁盘。这是当前使用的:

  snprintf((char*)buf+1,sizeof(buf)-1,"%.17g",val);

另外检查无穷大和非数字条件,以便也在最终的数据库文件中表示这一点。

不幸的是,将双精度数转换为字符串表示形式非常慢。虽然 Redis 中有一个函数可以更快地将整数转换为字符串表示形式。所以我的想法是检查是否可以将双精度数转换为整数而不丢失数据,然后使用该函数将整数转换为字符串(如果是这样)。

为了提供良好的加速,当然整数“等价性”的测试必须很快。所以我使用了一个可能是未定义行为的技巧,但在实践中效果很好。类似这样的:

double x = ... some value ...
if (x == (double)((long long)x))
    use_the_fast_integer_function((long long)x);
else
    use_the_slow_snprintf(x);

在我的推理中,上面的双精度转换将双精度转换为长整型,然后再转换回整数。如果范围合适,并且没有小数部分,则该数字将在转换后保留下来,并且与初始数字完全相同。

因为我想确保这不会破坏某些系统中的东西,所以我在 freenode 上加入了#c,但我受到了很多侮辱;)所以我现在在这里尝试。

有没有一种标准方法可以在不超出 ANSI C 的情况下完成我想做的事情?否则,上面的代码是否应该在当前 Redis 目标的所有 Posix 系统中工作?也就是说,现在运行 Linux / Mac OS X / *BSD / Solaris 的拱门?

为了使代码更清晰,我可以添加的是在尝试强制转换之前对双精度数的范围进行显式检查。

感谢您的帮助。

in Redis (http://code.google.com/p/redis) there are scores associated to elements, in order to take this elements sorted. This scores are doubles, even if many users actually sort by integers (for instance unix times).

When the database is saved we need to write this doubles ok disk. This is what is used currently:

  snprintf((char*)buf+1,sizeof(buf)-1,"%.17g",val);

Additionally infinity and not-a-number conditions are checked in order to also represent this in the final database file.

Unfortunately converting a double into the string representation is pretty slow. While we have a function in Redis that converts an integer into a string representation in a much faster way. So my idea was to check if a double could be casted into an integer without lost of data, and then using the function to turn the integer into a string if this is true.

For this to provide a good speedup of course the test for integer "equivalence" must be fast. So I used a trick that is probably undefined behavior but that worked very well in practice. Something like that:

double x = ... some value ...
if (x == (double)((long long)x))
    use_the_fast_integer_function((long long)x);
else
    use_the_slow_snprintf(x);

In my reasoning the double casting above converts the double into a long, and then back into an integer. If the range fits, and there is no decimal part, the number will survive the conversion and will be exactly the same as the initial number.

As I wanted to make sure this will not break things in some system, I joined #c on freenode and I got a lot of insults ;) So I'm now trying here.

Is there a standard way to do what I'm trying to do without going outside ANSI C? Otherwise, is the above code supposed to work in all the Posix systems that currently Redis targets? That is, archs where Linux / Mac OS X / *BSD / Solaris are running nowaday?

What I can add in order to make the code saner is an explicit check for the range of the double before trying the cast at all.

Thank you for any help.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

情深如许 2024-09-08 19:34:09

也许一些老式的定点数学可以帮助你。如果将双精度值转换为定点值,您仍然可以获得小数精度,并且转换为字符串与整数一样简单,只需添加一个移位函数即可。

另一个想法是推出您自己的 snprintf() 函数。许多 FPU 单元本身支持从 double 到 int 的转换,因此速度应该快如闪电。将其转换为字符串也很简单。

只是给你一些随机的想法。

Perhaps some old fashion fixed point math could help you out. If you converted your double to a fixed point value, you still get decimal precision and converting to a string is as easy as with ints with the addition of a single shift function.

Another thought would be to roll your own snprintf() function. Doing the conversion from double to int is natively supported by many FPU units so that should be lightning fast. Converting that to a string is simple as well.

Just a few random ideas for you.

ら栖息 2024-09-08 19:34:09

这样做的问题是比较不会按照您期望的方式进行。仅仅因为一个浮点值小于另一个浮点值并不意味着它的整数表示将小于另一个浮点值。另外,我看到您比较(以前的)双精度值之一是否相等。由于低位中的舍入和表示错误,您几乎永远不想这样做。

如果您只是寻找某种密钥来执行哈希之类的操作,那么它可能会很好。如果你真的关心哪些值真正具有更大或更小的价值,那么这是一个坏主意。

The problem with doing that is that the comparisons won't work out the way you'd expect. Just because one floating point value is less than another doesn't mean that its representation as an integer will be less than the other's. Also, I see you comparing one of the (former) double values for equality. Due to rounding and representation errors in the low-order bits, you almost never want to do that.

If you are just looking for some kind of key to do something like hashing on, it would probably work out fine. If you actually care about which values really have greater or lesser value, its a bad idea.

短暂陪伴 2024-09-08 19:34:09

只要 x 在 long long 的范围内,我就没有看到强制转换的问题。也许您应该查看 modf() 函数,它将双精度数分为整数部分和小数部分。然后,您可以添加针对 (double)LLONG_MIN 和 (double)LLONG_MAX 的检查来确定整数部分。尽管双精度可能存在困难。

但在做任何事情之前,您是否通过测量其性能来确定它确实是一个瓶颈?整数值的百分比是否足够高,足以真正产生影响?

I don't see a problem with the casts, as long as x is within the range of long long. Maybe you should check out the modf() function which separates a double into its integral and fractional part. You can then add checks against (double)LLONG_MIN and (double)LLONG_MAX for the integral part to make sure. Though there may be difficulties with the precision of double.

But before doing anything of this, have you made sure it actually is a bottleneck by measuring its performance? And is the percentage of integer values high enough that it would really make a difference?

白龙吟 2024-09-08 19:34:09

您的测试完全没问题(假设此时您已经分别处理了无穷大和 NAN) - 并且这可能是您确实想要比较浮点数是否相等的极少数情况之一。它不会调用未定义的行为 - 即使 x 超出了 long long 的范围,您也只会得到一个“实现定义的结果”,这是可以的这里。

唯一美中不足的是负零最终会变成正零(因为负零与正零比较等于)。

Your test is perfectly fine (assuming you have already separately handled infinities and NANs by this point) - and it's probably one of the very few occaisions when you really do want to compare floats for equality. It doesn't invoke undefined behaviour - even if x is outside of the range of long long, you'll just get an "implementation-defined result", which is OK here.

The only fly in the ointment is that negative zero will end up as positive zero (because negative zero compares equal to positive zero).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文