Contiguous pages/physical memory in Java

Posted 2024-07-18 05:47:58

My goal is to ensure that an array allocated in Java is allocated across contiguous physical memory. The issue that I've run into is that the pages an array is allocated across tend not to be contiguous in physical memory, unless I allocate a really large array.

My questions are:

  • Why does a really large array ensure pages which are contiguous in physical memory?
  • Is there any way to ensure an array is allocated across contiguous physical memory that doesn't involve making the array really large?
  • How can I tell what page or physical address a Java object/array exists in, without measuring cache hits/cache misses?

I'm not looking for answers asking why I am doing this in Java. I understand that C would "solve my problem", and that I'm going against the fundamental nature of Java. Nevertheless, I have a good reason for doing this.

The answers need not be guaranteed to work all the time. I am looking for answers that work most of the time. Extra points for creative, out-of-the-box answers that no reasonable Java programmer would ever write. It's OK to be platform specific (x86, 32-bit or 64-bit).

Comments (6)

嘿咻 2024-07-25 05:47:58

No. Physically contiguous memory requires direct interaction with the OS. Most applications, the JVM included, only get virtually contiguous addresses. And a JVM cannot give you what it doesn't get from the OS.

Besides, why would you want it? If you're setting up DMA transfers, you probably are using techniques besides Java anyway.

Bit of background:

Physical memory in a modern PC is typically a flexible amount, on replaceable DIMM modules. Each byte of it has a physical address, so the operating system determines during boot which physical addresses are available. It turns out applications are better off not using these addresses directly. Instead, all modern CPUs (and their caches) use virtual addresses. There is a mapping table to physical addresses, but this need not be complete: swapping to disk is enabled by the use of virtual addresses that are not mapped to physical addresses. Another level of flexibility is gained from having one table per process, with incomplete mappings. If process A has a virtual address that maps to physical address X, but process B doesn't, then there is no way that process B can write to physical address X, and we can consider that memory to be exclusive to process A. Obviously, for this to be safe, the OS has to protect access to the mapping table, but all modern OSes do.

The mapping table works at the page level. A page, or contiguous subset of physical addresses, is mapped to a contiguous subset of virtual addresses. The tradeoff between overhead and granularity has resulted in 4KB pages being a common page size. But as each page has its own mapping, one cannot assume contiguity beyond that page size. In particular, when pages are evicted from physical memory, swapped to disk, and restored, it's quite possible that they end up at a new physical memory address. The program doesn't notice, as the virtual address does not change; only the OS-managed mapping table does.
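To make the page-level mapping concrete, here is a minimal sketch of the arithmetic involved, assuming 4KB pages. The class name `PageMath`, its method names, and the sample address are made up for illustration. It only splits a virtual address into a page number and an offset; it does not (and from plain Java cannot) look up the physical frame behind a page.

```java
// A minimal sketch, assuming 4KB pages. PageMath and its methods are
// hypothetical names used only for this illustration.
final class PageMath {
    static final int PAGE_SHIFT = 12;                 // 2^12 = 4096 bytes per page (assumption)
    static final long PAGE_SIZE = 1L << PAGE_SHIFT;

    // Index of the virtual page that contains the given virtual address.
    static long pageNumber(long virtualAddress) {
        return virtualAddress >>> PAGE_SHIFT;
    }

    // Offset of the address inside its page; this part is identical in the
    // virtual and the physical address, because whole pages are mapped.
    static long pageOffset(long virtualAddress) {
        return virtualAddress & (PAGE_SIZE - 1);
    }

    // Number of pages touched by a block of 'length' bytes starting at 'start'.
    // Each of these pages has its own, independent mapping to a physical frame.
    static long pagesSpanned(long start, long length) {
        if (length <= 0) {
            return 0;
        }
        return pageNumber(start + length - 1) - pageNumber(start) + 1;
    }

    public static void main(String[] args) {
        long start = 0x1FF8L; // made-up virtual address, 8 bytes before a page boundary
        System.out.println("page   = " + pageNumber(start));
        System.out.println("offset = " + pageOffset(start));
        System.out.println("pages spanned by 16 bytes = " + pagesSpanned(start, 16));
    }
}
```

Even a 16-byte block can straddle two pages, and nothing in the virtual address says whether those two pages sit next to each other in physical memory.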

舂唻埖巳落 2024-07-25 05:47:58

Given that the garbage collector moves objects around in (logical) memory, I think you are going to be out of luck.

About the best you could do is use ByteBuffer.allocateDirect. That will (typically) not get moved around in (logical) memory by the GC, but it may be moved in physical memory or even paged out to disk. If you want any better guarantees, you'll have to go to the OS.

Having said that, if you can set the page size to be as big as your heap, then all arrays will necessarily be physically contiguous (or swapped out).
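A rough sketch of the ByteBuffer.allocateDirect approach follows, assuming Java 9 or later (for `alignedSlice` and `alignmentOffset`) and a 4KB page size. The class and method names are made up for illustration. It over-allocates and takes a page-aligned slice, so the buffer starts on a virtual page boundary. This only controls virtual alignment; whether consecutive pages are physically adjacent is still up to the OS.

```java
import java.nio.ByteBuffer;

// A minimal sketch, assuming Java 9+ and 4KB pages. DirectBufferSketch and
// allocatePageAligned are hypothetical names used only for this illustration.
final class DirectBufferSketch {
    static final int PAGE_SIZE = 4096; // assumption; the real page size is platform specific

    // Returns a direct buffer whose first byte lies on a (virtual) page boundary.
    static ByteBuffer allocatePageAligned(int size) {
        // Round the request up to whole pages and add one spare page, so that an
        // aligned region of at least 'size' bytes always fits inside the allocation.
        int rounded = Math.addExact(size, PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
        ByteBuffer raw = ByteBuffer.allocateDirect(Math.addExact(rounded, PAGE_SIZE));
        ByteBuffer aligned = raw.alignedSlice(PAGE_SIZE);
        aligned.limit(size); // expose only the bytes the caller asked for
        return aligned;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocatePageAligned(10_000);
        System.out.println("direct:  " + buffer.isDirect());
        System.out.println("aligned: " + (buffer.alignmentOffset(0, PAGE_SIZE) == 0));
        System.out.println("limit:   " + buffer.limit());
    }
}
```

The memory behind a direct buffer lives outside the garbage-collected heap, which is why the GC does not relocate it. Pinning it in physical memory would still need an OS-level call that the Java standard library does not expose.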

浅暮の光 2024-07-25 05:47:58

I would think that you would want to use sun.misc.Unsafe.

勿忘初心 2024-07-25 05:47:58

There may be ways to trick a specific JVM into doing what you want, but these would probably be fragile, complicated and most likely very specific to the JVM, its version, the OS it runs on, etc. In other words, wasted effort.

So without knowing more about your problem, I don't think anyone will be able to help.
There certainly is no way to do it in Java in general, at most on a specific JVM.

To suggest an alternative:

If you really need to store data in contiguous memory, why not do it in a small C library and call that via JNI?

ま昔日黯然 2024-07-25 05:47:58

As I see it, you have yet to explain why

  • primitive arrays would not be contiguous in memory. I don't see why they wouldn't be contiguous in virtual memory. (cf. arrays of Object, which are unlikely to have their Objects contiguous in memory.)
  • an array which is not contiguous in physical memory (RAM, i.e. Random Access Memory) would have a significant performance difference, e.g. a measurable difference in the performance of your application.

What it appears is that you are really looking for a low-level way to allocate arrays because you are used to doing this in C, and performance is the claimed reason for needing to do this.

BTW: Accessing a ByteBuffer.allocateDirect() buffer with, say, getDouble()/putDouble() can be slower than just using a double[], as the former adds per-access overhead (bounds checks, byte-order handling, calls that leave plain Java code) while the latter can be optimised to a plain load or store.

The reason it is used is for exchanging data between the Java and C spaces. e.g. NIO calls. It only performs well when read/writes are kept to a minimum. Otherwise you are better off using something in the Java space.

That is, unless you are clear about what you are doing and why you are doing it, you can end up with a solution which might make you feel better, but is actually more complicated and performs worse than the simple solution.
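The two access styles compared above can be put side by side in a small sketch. The class and method names are made up for illustration, and this is not a benchmark. It only shows that both forms hold the same data, so any difference between them is purely in access cost, which would need a proper harness to measure.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// A minimal sketch; AccessStyles and its methods are hypothetical names.
final class AccessStyles {

    // Plain Java array: element access compiles down to simple loads.
    static double sumArray(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    // Copies the array into off-heap memory in the platform's native byte order.
    static ByteBuffer toDirect(double[] values) {
        ByteBuffer buffer = ByteBuffer
                .allocateDirect(values.length * Double.BYTES)
                .order(ByteOrder.nativeOrder());
        for (int i = 0; i < values.length; i++) {
            buffer.putDouble(i * Double.BYTES, values[i]);
        }
        return buffer;
    }

    // Direct buffer: every read goes through an absolute getDouble() call.
    static double sumDirect(ByteBuffer buffer, int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += buffer.getDouble(i * Double.BYTES);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] values = {1.5, 2.5, 4.0};
        ByteBuffer direct = toDirect(values);
        System.out.println(sumArray(values) + " vs " + sumDirect(direct, values.length));
    }
}
```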

不念旧人 2024-07-25 05:47:58

Note this answer to a related question, which discusses System.identityHashCode() and identification of the memory address of the object. The bottom line is that, on some JVMs, you can use the default array hashCode() implementation to identify the original memory address of the array (subject to fitting in an int/32-bit). This is implementation specific, and any address obtained this way is a virtual address, not a physical one.
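As a small sketch of what this suggestion relies on: arrays do not override hashCode(), so the value they return is the same one System.identityHashCode() gives. The class name below is made up for illustration. Whether that number has anything to do with an address depends entirely on the JVM, so treat the printed value as an opaque identifier unless you know how your JVM derives it.

```java
// A minimal sketch; IdentityHashSketch is a hypothetical name.
final class IdentityHashSketch {

    // True when the object's hashCode() is the default identity hash,
    // which is always the case for arrays.
    static boolean usesIdentityHash(Object object) {
        return object.hashCode() == System.identityHashCode(object);
    }

    public static void main(String[] args) {
        int[] array = new int[1024];
        System.out.println("identity hash: 0x" + Integer.toHexString(System.identityHashCode(array)));
        System.out.println("same as hashCode(): " + usesIdentityHash(array));
    }
}
```

The identity hash stays the same for the lifetime of the object even if the garbage collector moves it, so at best it could reflect where the object was when the hash was first computed.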
