Final, non-canonical NaN double value changes during runtime
I am writing Java code that interacts with R, where "NA" values are distinguished from NaN values. NA indicates that a value is "statistically missing", that is, it could not be collected or is otherwise not available.
class DoubleVector {
    public static final double NA = Double.longBitsToDouble(0x7ff0000000001954L);

    public static boolean isNA(double input) {
        return Double.doubleToRawLongBits(input) == Double.doubleToRawLongBits(NA);
    }

    /// ...
}
The following unit test demonstrates the relationship between NaN and NA. It runs fine on my Windows laptop, but "isNA(NA) #2" sometimes fails on my Ubuntu workstation.
@Test
public void test() {
    assertFalse("isNA(NaN) #1", DoubleVector.isNA(DoubleVector.NaN));
    assertTrue("isNaN(NaN)", Double.isNaN(DoubleVector.NaN));
    assertTrue("isNaN(NA)", Double.isNaN(DoubleVector.NA));
    assertTrue("isNA(NA) #2", DoubleVector.isNA(DoubleVector.NA));
    assertFalse("isNA(NaN)", DoubleVector.isNA(DoubleVector.NaN));
}
From debugging, it appears that DoubleVector.NA is changed to the canonical NaN value 7ff8000000000000L, but it's hard to tell because printing it to stdout gives different values than the debugger.
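
One possible reason the printed value is hard to interpret: Double.toString prints "NaN" for every NaN bit pattern, and Double.doubleToLongBits collapses every NaN to the canonical 0x7ff8000000000000L. Only Double.doubleToRawLongBits reports the payload actually stored. The following is a minimal sketch for inspecting the bits; the class name NanBits and its helper methods are hypothetical, not part of the code above.

```java
public class NanBits {
    /** Hex string of the raw IEEE 754 bits, NaN payload included. */
    public static String rawHex(double d) {
        return Long.toHexString(Double.doubleToRawLongBits(d));
    }

    /** Hex string after doubleToLongBits, which maps every NaN to the canonical NaN. */
    public static String canonicalHex(double d) {
        return Long.toHexString(Double.doubleToLongBits(d));
    }

    public static void main(String[] args) {
        double na = Double.longBitsToDouble(0x7ff0000000001954L);
        // The raw value may or may not still carry the 0x1954 payload,
        // depending on how the platform copied the double.
        System.out.println("raw:       " + rawHex(na));
        // Always the canonical NaN pattern, whatever the payload was.
        System.out.println("canonical: " + canonicalHex(na));
        // Always the string "NaN".
        System.out.println("toString:  " + na);
    }
}
```

Comparing the "raw" line across the two JVMs listed below would show whether the payload is being rewritten before isNA ever sees it.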
Also, the test only fails if it runs after a number of other previous tests; if I run this test alone, it always passes.
Is this a JVM bug? A side effect of optimization?
Tests always pass on:
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)
Tests sometimes fail on:
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
Comments (1)
You are treading in very dangerous water here, one of the few areas where the Java VM behaviour is not exactly specified.
According to the JVM spec, there is only "a NaN value" in the double range. No arithmetic operation on doubles can distinguish between two different NaN values.
The documentation of longBitsToDouble() carries a note to the same effect: the method may not be able to return a double NaN with exactly the same bit pattern as the long argument.
So assuming that handling a double value will always keep the specific NaN value intact is a dangerous thing.
The cleanest solution would be to store your data in long and convert to double after checking for your special value. This will have a quite noticeable performance impact, however.
You might get away with adding the strictfp flag at the affected places. This doesn't in any way guarantee that it will work, but it will (possibly) change how your JVM handles floating point values and might just be the necessary hint that helps. It will still not be portable, however.
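
The long-based approach suggested above might look roughly like the sketch below. It is only an illustration under stated assumptions: the class NaLongs, its method names, and the sumSkippingNA example are hypothetical, and the NA bit pattern is copied from the question. The point is that the NA check compares long values only, so it never depends on a double copy preserving the NaN payload.

```java
public final class NaLongs {
    /** NA bit pattern from the question, kept as a long so no double copy can rewrite it. */
    public static final long NA_BITS = 0x7ff0000000001954L;

    private NaLongs() {}

    /** Encodes an ordinary double for storage in a long cell. */
    public static long encode(double value) {
        return Double.doubleToRawLongBits(value);
    }

    /** NA test performed purely on long bits. */
    public static boolean isNA(long bits) {
        return bits == NA_BITS;
    }

    /** Converts a cell to double for arithmetic; meant to be called after the isNA check. */
    public static double decode(long bits) {
        return Double.longBitsToDouble(bits);
    }

    /** Example use: sums the cells, skipping every NA entry. */
    public static double sumSkippingNA(long[] cells) {
        double sum = 0.0;
        for (long bits : cells) {
            if (!isNA(bits)) {
                sum += decode(bits);
            }
        }
        return sum;
    }
}
```

A vector would then hold a long[] instead of a double[], writing NaLongs.NA_BITS for missing cells and NaLongs.encode(x) for everything else. The extra conversion on each access is the performance cost mentioned above.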