Double-precision floating-point comparison
I'm a little confused here: would comparison of doubles still work correctly when they're stored as opaque (binary) fields? The problem I'm facing is that a double includes a leading bit for the sign (i.e. positive or negative), and when doubles are stored as binary data I'm not sure they will be compared correctly:
I want to ensure that the comparison will work correctly, because I'm using a double as part of a key tuple (e.g. ) in LevelDB and I want to preserve data locality for positive and negative numbers. LevelDB only uses opaque fields as keys, but it does allow users to specify their own comparator. However, I don't want to specify a comparator unless I absolutely need to:
// Three-way comparison function:
// if a < b: negative result
// if a > b: positive result
// else: zero result
inline int Compare(const unsigned char* a, const unsigned char* b) const
{
    if (*(double*)a < *(double*)b) return -1;
    if (*(double*)a > *(double*)b) return +1;
    return 0;
}
Making my comments an answer.
There are two things that could go wrong:
1. If either (or both) parameters is NAN, comparisons will always return false. So even if the binary representation is the same, NAN == NAN will always be false. Furthermore, it violates comparison transitivity.
2. If either parameter isn't properly aligned (since they are char pointers), you could run into problems on machines that don't support misaligned memory access. And for those that do, you may encounter a performance hit.
So to get around this problem, you'll need to add a trap case that will be invoked if either parameter turns out to be NAN. (I'm not sure on the status of INF.) Because of the need for this trap case, you will need to define your own comparison operator.
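As a rough illustration of the two points above, here is a minimal sketch of a comparator that copies the bytes out with std::memcpy (so alignment no longer matters) and traps NAN explicitly. The names LoadDouble and CompareDoubles are hypothetical, and sorting NAN after every other value is an arbitrary choice made for this sketch, not something the answer prescribes. Infinities need no special case here, since IEEE 754 orders them correctly under < and >.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// Hypothetical helper: read a double from a possibly unaligned byte buffer.
// std::memcpy sidesteps the misaligned access that *(double*)p could cause.
inline double LoadDouble(const unsigned char* p) {
    assert(p != nullptr);
    double d;
    std::memcpy(&d, p, sizeof d);
    return d;
}

// Three-way comparison that stays consistent in the presence of NaN.
// Assumption for this sketch: all NaNs compare equal to each other and
// greater than every non-NaN value.
inline int CompareDoubles(const unsigned char* a, const unsigned char* b) {
    const double x = LoadDouble(a);
    const double y = LoadDouble(b);
    const bool x_nan = std::isnan(x);
    const bool y_nan = std::isnan(y);
    if (x_nan || y_nan) {
        // The "trap case": NaN vs NaN gives 0, NaN vs number gives +1 / -1.
        return static_cast<int>(x_nan) - static_cast<int>(y_nan);
    }
    if (x < y) return -1;
    if (x > y) return +1;
    return 0;  // note: -0.0 and +0.0 are treated as equal here
}
```

Whether NAN keys should be rejected outright rather than ordered is a design decision that depends on the application.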
Yes, you have to specify your own comparison function. This is because doubles are not necessarily stored as 'big-endian' values. The exponent will not reside in memory before the mantissa even though logically it appears before the mantissa when the value is written out in big-endian format.
Of course, if you're sharing stuff between different CPU architectures in the same database, you may end up with weird endian problems anyway just because you stored stuff as binary blobs.
Lastly, even if you could control for endianness I would still not trust it. For example, if a double is not normalized it may not compare correctly to another double when compared as binary data.
Of course, everything the other person said about alignment and odd values like NAN and INF is important to pay attention to when writing a comparison function. But as for whether you should write one at all, I would have to say that it would be a really good idea.
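To make the endianness point in this answer concrete, the following sketch serialises doubles in both byte orders and compares them bytewise, the way an opaque-key store would without a custom comparator. It assumes double is a 64-bit IEEE 754 value; StoreLittleEndian, StoreBigEndian and BytewiseCompare are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: bit pattern of d, least significant byte first
// (the in-memory layout on little-endian CPUs).
inline void StoreLittleEndian(double d, unsigned char out[8]) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (8 * i));
}

// Hypothetical helper: same bit pattern, most significant byte first,
// so the sign and exponent bytes come before the mantissa bytes.
inline void StoreBigEndian(double d, unsigned char out[8]) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (8 * (7 - i)));
}

// Plain bytewise comparison of two 8-byte keys.
inline int BytewiseCompare(const unsigned char* a, const unsigned char* b) {
    assert(a != nullptr && b != nullptr);
    return std::memcmp(a, b, 8);
}
```

With these helpers, the little-endian bytes of 1.0 compare greater than those of 2.0, because the mantissa bytes are examined first. The big-endian bytes order 1.0 before 2.0 as expected, but they still place -1.0 after 1.0 and -1.0 before -2.0, since the sign bit is compared as an ordinary high bit.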
I assume that your number format conforms to the IEEE 754 standard. If that's the case, then a simple signed-integer comparison won't work -- if both numbers are negative, the result of the comparison is reversed. So you do have to provide your own comparator.
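A custom comparator is one way out. Another commonly used technique, which is not what this answer proposes, is to transform the key before storing it so that plain bytewise comparison gives numeric order. The sketch below assumes a 64-bit IEEE 754 double; EncodeOrderedDouble and DecodeOrderedDouble are hypothetical names. It flips the sign bit of non-negative values and inverts every bit of negative values, then writes the result most significant byte first.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Hypothetical helper: encode d into 8 bytes whose memcmp order
// matches numeric order for ordinary (non-NaN) values.
inline void EncodeOrderedDouble(double d, unsigned char out[8]) {
    assert(out != nullptr);
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    const std::uint64_t sign = std::uint64_t{1} << 63;
    if (bits & sign) {
        bits = ~bits;   // negative: invert all bits, reversing their order
    } else {
        bits |= sign;   // non-negative: set top bit so it sorts after negatives
    }
    for (int i = 0; i < 8; ++i)  // most significant byte first
        out[i] = static_cast<unsigned char>(bits >> (8 * (7 - i)));
}

// Hypothetical helper: inverse of EncodeOrderedDouble.
inline double DecodeOrderedDouble(const unsigned char in[8]) {
    assert(in != nullptr);
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits = (bits << 8) | in[i];
    const std::uint64_t sign = std::uint64_t{1} << 63;
    if (bits & sign) {
        bits &= ~sign;  // top bit set: the original was non-negative
    } else {
        bits = ~bits;   // top bit clear: the original was negative
    }
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Two caveats apply to this encoding: -0.0 and +0.0 become distinct keys (with -0.0 sorting first), and NAN values land at the extremes depending on their sign bit, so it may be simplest to reject NAN before encoding.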