Java map / nio / NFS problem causing a VM fault: "a fault occurred in a recent unsafe memory access operation in compiled Java code"

Posted on 2024-09-03 21:34:20

I have written a parser class for a particular binary format (nfdump, if anyone is interested) which uses java.nio's MappedByteBuffer to read through files of a few GB each. The binary format is just a series of headers and mostly fixed-size binary records, which are fed out to the caller via calls to nextRecord(); each call advances the state machine and returns null when it's done. It performs well. It works on a development machine.

On my production host, it can run for a few minutes or hours, but always seems to throw "java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code", pointing at one of the map.getInt / getShort calls, i.e. a read operation on the mapped buffer.

The uncontroversial (?) code that sets up the map is this:

    /** Set up the map from the given filename and position */
    protected void open() throws IOException {
        // Set up buffer; is this all the flexibility we'll need?
        channel = new FileInputStream(file).getChannel();
        MappedByteBuffer map1 = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        map1.load(); // we want the whole thing, plus it seems to reduce the frequency of crashes?
        map = map1;
        // assumes the host writing the files is little-endian (x86); ought to be configurable
        map.order(java.nio.ByteOrder.LITTLE_ENDIAN);
        map.position(position);
    }

and then I use the various map.get* methods to read shorts, ints, longs and other sequences of bytes, before hitting the end of the file and closing the map.
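For context, a minimal sketch of the kind of record-reading loop described above might look like the following; the RECORD_SIZE constant, class name and field layout are made up for illustration and are not the actual nfdump format or the original parser's code:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.ByteOrder;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class RecordReaderSketch {
        // Hypothetical fixed record size; the real nfdump format mixes
        // variable headers with mostly fixed-size records.
        private static final int RECORD_SIZE = 48;

        private final MappedByteBuffer map;

        public RecordReaderSketch(String path) throws IOException {
            FileChannel channel = new FileInputStream(path).getChannel();
            map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            map.order(ByteOrder.LITTLE_ENDIAN); // files written by a little-endian host
        }

        /** Returns the next (made-up) record as a long[], or null at end of file. */
        public long[] nextRecord() {
            if (map.remaining() < RECORD_SIZE) {
                return null;
            }
            // Relative get* calls advance the buffer position through the record.
            int type = map.getShort();
            int length = map.getShort();
            long first = map.getLong();
            long last = map.getLong();
            map.position(map.position() + (RECORD_SIZE - 20)); // skip the rest of the record
            return new long[] { type, length, first, last };
        }
    }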

I've never seen the exception thrown on my development host. But the significant point of difference between my production host and development is that on the former, I am reading sequences of these files over NFS (probably 6-8TB eventually, still growing). On my dev machine, I have a smaller selection of these files locally (60GB), but when it blows up on the production host it's usually well before it gets to 60GB of data.

Both machines are running java 1.6.0_20-b02, though the production host is running Debian/lenny and the dev host Ubuntu/karmic. I'm not convinced that makes any difference. Both machines have 16GB RAM and are running with the same java heap settings.

I take the view that if there is a bug in my code, there is enough of a bug in the JVM not to throw me a proper exception! But I think it is just a particular JVM implementation bug due to interactions between NFS and mmap, possibly a recurrence of 6244515 which is officially fixed.

I already tried adding in a "load" call to force the MappedByteBuffer to load its contents into RAM - this seemed to delay the error in the one test run I've done, but not prevent it. Or it could be coincidence that that was the longest it had gone before crashing!

If you've read this far and have done this kind of thing with java.nio before, what would your instinct be? Right now mine is to rewrite it without nio :)

Comments (1)

爱冒险 2024-09-10 21:34:20

I would rewrite it without using mapped NIO. If you're dealing with more than one file, there is a problem that the mapped memory is never released, so you will run out of virtual memory: NB this isn't necessarily just an OutOfMemoryError that interacts with the garbage collector; it could be a failure to allocate a new mapped buffer. I would use a FileChannel.
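As a rough sketch of what that non-mapped approach could look like (the buffer size, class name and loop structure here are assumptions for illustration, not the answerer's code), reading through an explicit buffer keeps no long-lived mapping of the file in the process address space:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.channels.FileChannel;

    public class ChannelReaderSketch {
        public static void main(String[] args) throws IOException {
            FileChannel channel = new FileInputStream(args[0]).getChannel();
            ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20); // 1 MB; size is arbitrary
            buf.order(ByteOrder.LITTLE_ENDIAN);
            long total = 0;
            while (channel.read(buf) != -1) {
                buf.flip();
                // ... parse complete records out of buf here ...
                total += buf.remaining();
                buf.clear(); // a real parser must carry over any partial record
            }
            channel.close();
            System.out.println("read " + total + " bytes");
        }
    }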

Having said that, large-scale operations on NFS files are always extremely problematic. You would be much better off redesigning the system so that each file is read by its local CPU. You will also get immense speed improvements this way, far more than the 20% you will lose by not using mapped buffers.
