The on-heap store refers to objects that will be present in the Java heap (and also subject to GC). On the other hand, the off-heap store refers to (serialized) objects that are managed by EHCache, but stored outside the heap (and also not subject to GC). As the off-heap store continues to be managed in memory, it is slightly slower than the on-heap store, but still faster than the disk store.
The internal details of how the off-heap store is managed and used aren't very evident in the link posted in the question, so it is worth checking out the details of Terracotta BigMemory, which is used to manage the off-heap store. BigMemory (the off-heap store) is meant to avoid the overhead of GC on a heap that is several megabytes or gigabytes large. BigMemory uses the memory address space of the JVM process via direct ByteBuffers, which, unlike ordinary Java objects, are not subject to GC.
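The direct ByteBuffers mentioned above can be tried with nothing but the JDK; here is a minimal sketch (the 64 MB size and class name are just for illustration):

```java
import java.nio.ByteBuffer;

public class OffHeapDemo {
    static long roundTrip() {
        // Allocate 64 MB outside the Java heap; the backing memory
        // is not scanned or moved by the garbage collector.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        assert offHeap.isDirect(); // confirms the memory lives off-heap

        offHeap.putLong(0, 42L);      // write raw bytes at an absolute offset
        return offHeap.getLong(0);    // read them back
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints 42
    }
}
```

Only the small ByteBuffer wrapper object itself sits on the heap; the 64 MB of data does not, which is exactly what keeps GC pause times independent of the cached data volume.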
from http://code.google.com/p/fast-serialization/wiki/QuickStartHeapOff
What is Heap-Offloading?
Usually all non-temporary objects you allocate are managed by Java's garbage collector. Although the VM does a decent job of garbage collection, at a certain point it has to perform a so-called 'Full GC'. A full GC involves scanning the complete allocated heap, which means GC pauses/slowdowns are proportional to an application's heap size. So don't trust anyone telling you 'memory is cheap'. In Java, memory consumption hurts performance. Additionally, you may get notable pauses with heap sizes > 1 GB. This can be nasty if you have any near-real-time processing going on; in a cluster or grid, a Java process might become unresponsive and get dropped from the cluster.
However, today's server applications (frequently built on top of bloated frameworks ;-) ) easily require heaps far beyond 4 GB.
One solution to these memory requirements is to 'offload' parts of the objects to memory outside the Java heap (allocated directly from the OS). Fortunately, java.nio provides classes to directly allocate, read, and write 'unmanaged' chunks of memory (even memory-mapped files).
So one can allocate large amounts of 'unmanaged' memory and use it to store objects there. To store arbitrary objects in unmanaged memory, the most viable solution is serialization: the application serializes objects into the off-heap memory, and later reads them back using deserialization.
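A minimal sketch of this serialize-into-unmanaged-memory idea, using plain JDK serialization rather than FST (the class and method names here are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

public class OffHeapSerialization {
    // Serialize any Serializable object into an off-heap (direct) buffer.
    static ByteBuffer store(Serializable obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            byte[] bytes = bos.toByteArray();
            ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
            buf.put(bytes);
            buf.flip(); // make the buffer readable from position 0
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copy the bytes back onto the heap and deserialize them;
    // only this reconstructed copy is a GC-managed object again.
    static Object load(ByteBuffer buf) {
        try {
            byte[] bytes = new byte[buf.remaining()];
            buf.duplicate().get(bytes); // duplicate() leaves buf's position untouched
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = store("hello off-heap");
        System.out.println(load(buf)); // prints: hello off-heap
    }
}
```

While the serialized bytes sit in the direct buffer, the GC never sees them; the cost you pay instead is the serialize/deserialize round trip on each access, which is why the serializer's speed dominates.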
The heap size managed by the Java VM can then be kept small, so GC pauses are in the milliseconds. Everybody is happy, job done.
Clearly, the performance of such an off-heap buffer depends mostly on the performance of the serialization implementation. Good news: for some reason, FST serialization is pretty fast :-).
Sample usage scenarios:
Session cache in a server application. Use a memory mapped file to store gigabytes of (inactive) user sessions. Once the user logs into your application, you can quickly access user-related data without having to deal with a database.
Caching of computational results (queries, HTML pages, ...) (only applicable if the computation is slower than deserializing the result object, of course).
Very simple and fast persistence using memory-mapped files.
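The memory-mapped-file scenario above can also be sketched with java.nio (the file name, offset, and value are illustrative):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class MappedPersistence {
    static int roundTrip() {
        try {
            Path file = Files.createTempFile("sessions", ".bin");
            int value;
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
                 FileChannel ch = raf.getChannel()) {
                // Map 4 KB of the file into the process address space; the pages
                // live outside the Java heap and are written back by the OS.
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
                map.putInt(0, 12345);      // this write lands in the file
                value = map.getInt(0);
            }
            Files.deleteIfExists(file);    // clean up the demo file
            return value;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints 12345
    }
}
```

Because the OS pages the mapped region in and out, such a store can be far larger than physical RAM, and the data survives a JVM restart.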
Edit: For some scenarios one might choose more sophisticated garbage collection algorithms such as Concurrent Mark-Sweep or G1 to support larger heaps (but this too has its limits beyond ~16 GB heaps). There is also a commercial JVM with an improved 'pauseless' GC (Azul) available.
The heap is the place in memory where your dynamically allocated objects live. If you used new then it's on the heap. That's as opposed to stack space, which is where the function stack lives. If you have a local variable then that reference is on the stack.
Java's heap is subject to garbage collection and the objects are usable directly.
EHCache's off-heap storage takes your regular object off the heap, serializes it, and stores it as bytes in a chunk of memory that EHCache manages. It's like storing it to disk but it's still in RAM. The objects are not directly usable in this state, they have to be deserialized first. Also not subject to garbage collection.
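As a rough illustration of how such a tiered store is declared, an Ehcache 3-style XML configuration might look like this (the alias, types, and sizes are made up; check the documentation for your version, since Ehcache 2 / BigMemory used a different configuration format):

```xml
<config xmlns="http://www.ehcache.org/v3">
  <cache alias="userSessions">
    <key-type>java.lang.String</key-type>
    <value-type>java.lang.String</value-type>
    <resources>
      <!-- hot entries stay as live objects on the heap -->
      <heap unit="entries">1000</heap>
      <!-- overflow is serialized into GC-free off-heap memory -->
      <offheap unit="MB">512</offheap>
    </resources>
  </cache>
</config>
```

Entries evicted from the small heap tier are serialized into the off-heap tier, and deserialized back on access, which matches the behavior described above.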
Not 100% sure; however, it sounds like the heap is an object or set of allocated space (in RAM) that is built into the functionality of the code, either Java itself or, more likely, ehcache itself, and that the off-heap RAM is its own system as well. However, it sounds like this is an order of magnitude slower, as it is not as organized, meaning it may not use a heap (one long contiguous stretch of RAM) and instead uses different address spaces, likely making it slightly less efficient.
Then of course the next tier lower is hard-drive space itself.
I don't use ehcache, so you may not want to trust me, but that is what I gathered from their documentation.
In short picture / detailed picture (pic credits in the original answer):
<img src="https://i.sstatic.net/EvQgj.jpg" alt="Java on/off heap storage details">
The JVM doesn't know anything about off-heap memory. Ehcache implements an on-disk cache as well as an in-memory cache.