Truncating a memory-mapped file

Posted on 2024-11-07 07:45:45

I am using memory mapped IO for an index file, but the problem is that I'm not able to resize the file if it is mostly empty.

Somewhere before:

MappedByteBuffer map = raf.getChannel().map(MapMode.READ_WRITE, 0, 1 << 30);
raf.close();
// use map
map.force();
map = null;

Resize:

for (int c = 0; c < 100; c++) {
    RandomAccessFile raf = new RandomAccessFile(indexFile, "rw");
    try {
        raf.setLength(newLen);
        if (c > 0) LOG.warn("used " + c + " iterations to close mapped byte buffer");
        return;
    } catch (Exception e) {
        System.gc();
        Thread.sleep(10);
        System.runFinalization();
        Thread.sleep(10);
    } finally {
        raf.close();
    }
}

On Windows or 32-bit Linux I often hit the unmapping problem. In the 64-bit Linux production environment everything seems to work without warnings, yet the file keeps its original size.

Can anyone explain why this happens and/or how to solve the problem?

Comments (2)

萌面超妹 2024-11-14 07:45:45

Your issue is that you are using an unreliable method to close the mapped byte buffer (one hundred calls to System.gc() and System.runFinalization() don't guarantee you anything). Unfortunately, there is no reliable method in the Java API to do that, but on the Sun JVM (and perhaps on some others too) you can use the following code:

public void unmapMmaped(ByteBuffer buffer) {
  if (buffer instanceof sun.nio.ch.DirectBuffer) {
    sun.misc.Cleaner cleaner = ((sun.nio.ch.DirectBuffer) buffer).cleaner();
    cleaner.clean();
  }
}

Of course it is JVM-dependent and you should be ready to fix your code if Sun ever decides to change sun.nio.ch.DirectBuffer or sun.misc.Cleaner in an incompatible manner (but actually I don't believe this will ever happen).
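
For readers on newer JVMs: in JDK 9 sun.misc.Cleaner was moved to an internal package, and sun.misc.Unsafe gained an invokeCleaner(ByteBuffer) method that serves the same purpose. The following is only a minimal sketch of that route, using reflection so it still compiles where the method is absent. The class name UnmapSketch, the unmap helper, and the temp-file demo are illustrative names of mine, not part of the answer above, and the approach is just as JVM-dependent as the original.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class UnmapSketch {

    /**
     * Tries to release a direct or mapped buffer immediately.
     * Returns true if the cleaner was invoked, false if the buffer is not
     * direct or the JVM does not offer Unsafe.invokeCleaner (e.g. Java 8).
     */
    public static boolean unmap(ByteBuffer buffer) {
        if (buffer == null || !buffer.isDirect()) {
            return false;
        }
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            // Present since JDK 9; rejects slices and duplicates with an exception.
            Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
            invokeCleaner.invoke(unsafe, buffer);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false; // unsupported here: caller needs a fallback
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("index", ".bin"); // hypothetical index file
        MappedByteBuffer map;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);
            map.put(0, (byte) 1);
            map.force();
        }
        boolean released = unmap(map);
        map = null; // never touch the buffer after unmapping; doing so can crash the JVM
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ch.truncate(16);
        }
        System.out.println("released=" + released + ", size=" + Files.size(file));
        Files.delete(file);
    }
}
```

The helper returns a boolean instead of throwing, so the caller can decide whether to fall back to another strategy when the JVM does not cooperate.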

谁把谁当真 2024-11-14 07:45:45

This is just a supplement to the previous answer, which is completely correct.

JDK 1.7 complains about the use of sun.misc.Cleaner, saying that classes in this namespace are not a formal part of the JDK, and may disappear in the future. However, as of 1.7 they are still present.

If the .clean() method is unavailable, System.gc() can be used as a fallback. This must be acknowledged as a "hack", however, and used with care.

While System.gc() cannot force an unreferenced mapping to be closed, in practice it will often cause cleanup to happen. Experience on 32-bit Linux (and Solaris) shows that buffers were released in every test during either the first or second call to System.gc(). However, the behavior on Windows is different. In most cases, all mappings are released by the end of the second call to System.gc(), but sometimes it takes 3 calls. Occasionally even more calls are required, with higher call counts becoming progressively rarer. This can be deceptive: tests may indicate that 4 calls are all that is required, only to have it fail on you a month later. 5 calls may then seem adequate, only to lead to failure in 6 months.

Testing whether a map has been released can be done with a try/catch block around FileChannel.truncate(), with a loop to re-attempt the operation on failure. The loop cannot be infinite, as there are pathological cases where a particular heap configuration will lead the garbage collector to never clean up a mapping. However, a loop of about 10 attempts will cover almost all cases. If the object isn't gone by that point, then it's not going anywhere and the application will have to give up. That may seem inadequate, but in practice it is extremely unlikely, and will only be an issue on a JVM that doesn't support cleaners.
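
The bounded retry described above could look roughly like the sketch below. The names TruncateRetrySketch, truncateWithRetry, and MAX_ATTEMPTS, as well as the 10 ms pause, are my own illustrative choices. The size check after truncate() is also an assumption on my part, added because the question reports a case where no exception was raised yet the file kept its original size.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TruncateRetrySketch {

    // "About 10" attempts, as suggested in the text; tune as needed.
    static final int MAX_ATTEMPTS = 10;

    /**
     * Tries to truncate the file, nudging the garbage collector between
     * attempts so that an unreferenced mapping may be released.
     * Returns the number of attempts used; throws IOException if all fail.
     */
    public static int truncateWithRetry(Path file, long newLen)
            throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
                ch.truncate(newLen);
                // Do not trust the absence of an exception: verify the size.
                if (ch.size() <= newLen) {
                    return attempt;
                }
                last = new IOException("file is still " + ch.size() + " bytes");
            } catch (IOException e) {
                last = e; // typically: a mapping of the file is still open
            }
            System.gc();      // a hint only; there is no guarantee of cleanup
            Thread.sleep(10);
        }
        throw new IOException(
                "could not truncate after " + MAX_ATTEMPTS + " attempts", last);
    }
}
```

A caller would invoke truncateWithRetry(indexFile.toPath(), newLen) once all references to the mapped buffer have been dropped, and treat the final IOException as the "give up" case.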
