Why does sun.misc.Unsafe exist, and how can it be used in the real world?
I came across the sun.misc.Unsafe class the other day and was amazed at what it could do.
Of course, the class is undocumented, but I was wondering if there was ever a good reason to use it. What scenarios might arise where you would need to use it? How might it be used in a real-world scenario?
Furthermore, if you do need it, does that not indicate that something is probably wrong with your design?
Why does Java even include this class?
Examples
VM "intrinsification", i.e. CAS (compare-and-swap) as used in lock-free hash tables, e.g. sun.misc.Unsafe.compareAndSwapInt. It can make real JNI calls into native code that contains special instructions for CAS. Read more about CAS here: http://en.wikipedia.org/wiki/Compare-and-swap
The sun.misc.Unsafe functionality of the host VM can be used to allocate uninitialized objects and then interpret the constructor invocation as any other method call.
One can track the data from the native address. It is possible to retrieve an object's memory address using the sun.misc.Unsafe class, and operate on its fields directly via the unsafe get/put methods!
Compile-time optimizations for the JVM. High-performance VMs use "magic" that requires low-level operations, e.g. http://en.wikipedia.org/wiki/Jikes_RVM
Allocating memory with sun.misc.Unsafe.allocateMemory, e.g. the DirectByteBuffer constructor calls it internally when ByteBuffer.allocateDirect is invoked.
Tracing the call stack and replaying it with values instantiated by sun.misc.Unsafe, which is useful for instrumentation.
sun.misc.Unsafe.arrayBaseOffset and arrayIndexScale can be used to develop arraylets, a technique for efficiently breaking up large arrays into smaller objects to limit the real-time cost of scan, update or move operations on large objects.
http://robaustin.wikidot.com/how-to-write-to-direct-memory-locations-in-java
More on references here: http://bytescrolls.blogspot.com/2011/04/interesting-uses-of-sunmiscunsafe.html
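To make the CAS item concrete, here is a minimal sketch of a lock-free counter built on compareAndSwapInt. The class name CasCounter and the reflective lookup of the private theUnsafe field are illustrative assumptions, not something the answer above prescribes, and the lookup relies on the JDK still shipping sun.misc.Unsafe.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical example class: a counter incremented with a CAS retry loop.
final class CasCounter {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long VALUE_OFFSET = valueOffset();

    private volatile int value;

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Raw offset of the "value" field inside a CasCounter instance.
    private static long valueOffset() {
        try {
            return UNSAFE.objectFieldOffset(CasCounter.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Retry until the swap succeeds, i.e. no other thread changed value in between.
    int incrementAndGet() {
        int current;
        do {
            current = value;
        } while (!UNSAFE.compareAndSwapInt(this, VALUE_OFFSET, current, current + 1));
        return current + 1;
    }

    int get() {
        return value;
    }

    public static void main(String[] args) {
        CasCounter counter = new CasCounter();
        counter.incrementAndGet();
        counter.incrementAndGet();
        System.out.println(counter.get());
    }
}
```

In ordinary application code, AtomicInteger offers the same operation through a supported API.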
Just from running a search in some code search engine I get the following examples:
There are many other examples, just follow the above link...
Interesting, I'd never even heard of this class (which is probably a good thing, really).
One thing that jumps to mind is using Unsafe#setMemory to zeroize buffers that contained sensitive information at one point (passwords, keys, ...). You could even do this to fields of "immutable" objects (then again I suppose plain old reflection might do the trick here too). I'm no security expert though so take this with a grain of salt.
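A rough sketch of that idea, assuming the secret lives in a byte[]: the SecretWiper class and its wipe method are made-up names, and for a plain array Arrays.fill would achieve the same result without Unsafe.

```java
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import sun.misc.Unsafe;

// Hypothetical helper that overwrites a byte[] in place with zeros.
final class SecretWiper {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Fills the array's element storage (starting at the byte[] base offset) with 0.
    static void wipe(byte[] secret) {
        UNSAFE.setMemory(secret, Unsafe.ARRAY_BYTE_BASE_OFFSET, secret.length, (byte) 0);
    }

    public static void main(String[] args) {
        byte[] password = "hunter2".getBytes(StandardCharsets.US_ASCII);
        wipe(password);
        System.out.println(Arrays.toString(password));
    }
}
```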
Based on a very brief analysis of the Java 1.6.12 library using Eclipse for reference tracing, it seems as though every useful functionality of Unsafe is exposed in useful ways.
CAS operations are exposed through the Atomic* classes.
Memory manipulation functions are exposed through DirectByteBuffer.
Sync instructions (park, unpark) are exposed through the AbstractQueuedSynchronizer, which in turn is used by Lock implementations.
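As a small illustration of two of those supported wrappers, the sketch below uses only public APIs: AtomicInteger for CAS and a direct ByteBuffer for off-heap memory. The class and method names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of the public classes that wrap Unsafe-style functionality.
final class SupportedWrappers {

    // CAS through the public API: succeeds only if the counter still holds `expected`.
    static boolean casOnce(AtomicInteger counter, int expected, int update) {
        return counter.compareAndSet(expected, update);
    }

    // Off-heap storage through the public API: write a long into a direct buffer and read it back.
    static long roundTripOffHeap(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(Long.BYTES);
        buffer.putLong(0, value);
        return buffer.getLong(0);
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(5);
        System.out.println(casOnce(counter, 5, 6));
        System.out.println(casOnce(counter, 5, 7));
        System.out.println(roundTripOffHeap(42L));
    }
}
```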
Unsafe.throwException - allows you to throw checked exceptions without declaring them.
This is useful in some cases where you deal with reflection or AOP.
Assume you build a generic proxy for a user-defined interface, and the user can specify which exception the implementation throws in a special case just by declaring the exception in the interface. Then this is the only way I know of to raise a checked exception in the dynamic implementation of the interface.
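A minimal sketch of throwing a checked exception from a method that does not declare it. SneakyThrower and sneakyThrow are hypothetical names, and the example assumes throwException is still present on the sun.misc.Unsafe shipped with the JDK in use.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper: rethrows any Throwable without a throws clause.
final class SneakyThrower {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // No "throws" clause, yet a checked exception passed in will propagate to the caller.
    static void sneakyThrow(Throwable t) {
        UNSAFE.throwException(t);
    }

    public static void main(String[] args) {
        try {
            sneakyThrow(new IOException("disk unplugged"));
        } catch (Exception e) {
            // The compiler cannot know an IOException arrives here, so catch the supertype.
            System.out.println("caught " + e.getClass().getName());
        }
    }
}
```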
One use of it is in the java.util.concurrent.atomic classes.
For efficient memory copy (faster to copy than System.arraycopy() for short blocks at least); as used by Java LZF and Snappy codecs. They use 'getLong' and 'putLong', which are faster than doing copies byte-by-byte; especially efficient when copying things like 16/32/64 byte blocks.
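The word-at-a-time copy might look roughly like the sketch below, which moves a 16-byte block between byte arrays with two getLong/putLong pairs. WordCopy and copy16 are invented names and this is not the code of either codec. It also assumes the platform tolerates unaligned 8-byte accesses, which holds on x86 but not on every architecture.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper that copies a fixed 16-byte block as two 8-byte words.
final class WordCopy {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    static void copy16(byte[] src, int srcPos, byte[] dst, int dstPos) {
        // Unsafe does no checking, so validate both ranges before touching memory.
        if (srcPos < 0 || dstPos < 0 || srcPos > src.length - 16 || dstPos > dst.length - 16) {
            throw new IndexOutOfBoundsException("16-byte block does not fit");
        }
        long from = Unsafe.ARRAY_BYTE_BASE_OFFSET + (long) srcPos;
        long to = Unsafe.ARRAY_BYTE_BASE_OFFSET + (long) dstPos;
        // Both words are read and written in native byte order, so the bytes arrive unchanged.
        UNSAFE.putLong(dst, to, UNSAFE.getLong(src, from));
        UNSAFE.putLong(dst, to + 8, UNSAFE.getLong(src, from + 8));
    }

    public static void main(String[] args) {
        byte[] src = new byte[32];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] dst = new byte[32];
        copy16(src, 3, dst, 5);
        System.out.println(java.util.Arrays.toString(dst));
    }
}
```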
I was recently working on reimplementing the JVM and found that a surprising number of classes are implemented in terms of Unsafe. The class is mostly designed for the Java library implementers and contains features that are fundamentally unsafe but necessary for building fast primitives. For example, there are methods for getting raw field offsets and writing through them, using hardware-level synchronization, allocating and freeing memory, etc. It is not intended to be used by normal Java programmers; it's undocumented, implementation-specific, and inherently unsafe (hence the name!). Moreover, I think that the SecurityManager will disallow access to it in almost all cases.
In short, it mainly exists to allow library implementers access to the underlying machine without having to declare every method in certain classes like AtomicInteger native. You shouldn't need to use or worry about it in routine Java programming, as the whole point is to make the rest of the libraries fast enough that you wouldn't need that sort of access.
Off-heap collections may be useful for allocating huge amounts of memory and deallocating it immediately after use without GC interference. I wrote a library for working with off-heap arrays/lists based on sun.misc.Unsafe.
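For a sense of what an off-heap collection involves, here is a minimal sketch of a fixed-size array of longs backed by allocateMemory. It is not the library mentioned above; OffHeapLongArray and its methods are hypothetical names, and the caller is responsible for calling close().

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical off-heap array of longs; memory is freed explicitly, not by the GC.
final class OffHeapLongArray implements AutoCloseable {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long length;
    private long address; // 0 once the memory has been freed

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    OffHeapLongArray(long length) {
        if (length <= 0 || length > Long.MAX_VALUE / Long.BYTES) {
            throw new IllegalArgumentException("bad length: " + length);
        }
        this.length = length;
        this.address = UNSAFE.allocateMemory(length * Long.BYTES);
        // allocateMemory returns uninitialized memory, so zero it explicitly.
        UNSAFE.setMemory(address, length * Long.BYTES, (byte) 0);
    }

    long get(long index) {
        check(index);
        return UNSAFE.getLong(address + index * Long.BYTES);
    }

    void set(long index, long value) {
        check(index);
        UNSAFE.putLong(address + index * Long.BYTES, value);
    }

    // Unsafe performs no checks, so guard against stray indexes and use after free here.
    private void check(long index) {
        if (address == 0) {
            throw new IllegalStateException("array already closed");
        }
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    @Override
    public void close() {
        if (address != 0) {
            UNSAFE.freeMemory(address);
            address = 0;
        }
    }

    public static void main(String[] args) {
        try (OffHeapLongArray array = new OffHeapLongArray(4)) {
            array.set(2, 99L);
            System.out.println(array.get(2));
        }
    }
}
```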
Use it to access and allocate large amounts of memory efficiently, such as in your very own voxel engine! (i.e. Minecraft-style game.)
In my experience, the JVM is often unable to eliminate bounds-checking in the places where you truly need it to. For example, if you're iterating over a large array, but the actual memory access is tucked underneath a non-virtual* method call in the loop, the JVM may still perform a bounds check with each array access, rather than once just before the loop. Thus, for potentially large performance gains, you can eliminate JVM bounds-checking inside the loop via a method which employs sun.misc.Unsafe to access the memory directly, making sure to do any bounds-checking yourself at the correct places. (You are gonna bounds check at some level, right?)
*By non-virtual, I mean the JVM shouldn't have to dynamically resolve whatever your particular method is, because you've correctly guaranteed that class/method/instance are some combination of static/final/what-have-you.
For my home-grown voxel engine, this resulted in a dramatic performance gain during chunk generation and serialization (in other words, places where I was reading/writing the entire array at once). Results may vary, but if a lack of bounds-check elimination is your problem, then this will fix it.
There are some potentially major problems with this: specifically, when you provide the ability to access memory without bounds-checking to clients of your interface, they will probably abuse it. (Don't forget that hackers can also be clients of your interface... especially in the case of a voxel engine written in Java.) Thus, you should either design your interface in a way such that memory access cannot be abused, or you should be extremely careful to validate user-data before it can ever, ever mingle with your dangerous interface. Considering the catastrophic things a hacker can do with unchecked memory access, it's probably best to take both approaches.
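The pattern described in this answer, one explicit range check followed by unchecked reads, could be sketched as follows. UncheckedSum is a made-up class, not code from the voxel engine, and whether it actually beats a plain loop depends on the JIT and should be measured.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical example: validate the range once, then read elements without per-access checks.
final class UncheckedSum {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long BASE = UNSAFE.arrayBaseOffset(int[].class);
    private static final long SCALE = UNSAFE.arrayIndexScale(int[].class);

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Sums data[from] .. data[to - 1].
    static long sum(int[] data, int from, int to) {
        // The single bounds check, done by hand before the loop.
        if (from < 0 || to > data.length || from > to) {
            throw new IndexOutOfBoundsException("range " + from + ".." + to);
        }
        long total = 0;
        for (int i = from; i < to; i++) {
            // Element address = array base offset + index * element size.
            total += UNSAFE.getInt(data, BASE + i * SCALE);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(sum(data, 1, 4));
    }
}
```

Keeping the unchecked access private to one method, with the range check at its entry, is one way to follow the advice about not exposing raw memory access to clients.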
We have implemented huge collections such as arrays, HashMaps, and TreeMaps using Unsafe.
To avoid or minimize fragmentation, we implemented a memory allocator over Unsafe using the concepts of dlmalloc.
This helped us gain performance under concurrency.
Unsafe.park() and Unsafe.unpark() for the construction of custom concurrency control structures and cooperative scheduling mechanisms.
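A minimal sketch of a hand-rolled handoff built on parking. It uses java.util.concurrent.locks.LockSupport, the public wrapper around park and unpark, rather than calling Unsafe directly; ParkHandoff and its methods are hypothetical names.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical one-shot handoff: a waiter parks until another thread publishes a message.
final class ParkHandoff {
    private volatile String message;

    // Blocks the calling thread until a message is available.
    String awaitMessage() {
        // park() may return spuriously, so always re-check the condition in a loop.
        while (message == null) {
            LockSupport.park(this);
        }
        return message;
    }

    // Publishes the message first, then wakes the waiter. If the waiter has not parked yet,
    // unpark leaves a permit behind so its next park() returns immediately.
    void publish(String value, Thread waiter) {
        message = value;
        LockSupport.unpark(waiter);
    }

    // Runs one handoff between a worker thread and the calling thread.
    static String demo() {
        ParkHandoff handoff = new ParkHandoff();
        String[] received = new String[1];
        Thread waiter = new Thread(() -> received[0] = handoff.awaitMessage());
        waiter.start();
        handoff.publish("ready", waiter);
        try {
            waiter.join(); // join() makes the worker's write to received[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```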
Haven't used it myself, but I suppose if you have a variable that is only occasionally read by more than one thread (so you don't really want to make it volatile), you could use putObjectVolatile when writing it in the main thread and getObjectVolatile when doing the rare reads from other threads.
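A sketch of that idea, with the caveat that it is untested as a performance technique: the field is declared without volatile, and only the publishing write and the rare cross-thread read go through the volatile accessors. RarelyReadConfig and its methods are invented names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical holder whose field is plain, with volatile semantics applied only on demand.
final class RarelyReadConfig {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long CURRENT_OFFSET = currentOffset();

    private Object current; // deliberately not declared volatile

    // Assumption: the singleton is reachable through the private static field "theUnsafe".
    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    private static long currentOffset() {
        try {
            return UNSAFE.objectFieldOffset(RarelyReadConfig.class.getDeclaredField("current"));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Write with volatile semantics, as the owning thread would when publishing.
    void publish(Object value) {
        UNSAFE.putObjectVolatile(this, CURRENT_OFFSET, value);
    }

    // Read with volatile semantics, for the occasional read from another thread.
    Object readFresh() {
        return UNSAFE.getObjectVolatile(this, CURRENT_OFFSET);
    }

    // Ordinary read for the owning thread's hot path.
    Object readPlain() {
        return current;
    }

    public static void main(String[] args) {
        RarelyReadConfig config = new RarelyReadConfig();
        config.publish("v1");
        System.out.println(config.readFresh());
    }
}
```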
You need it if you need to replace functionality provided by one of the classes which uses it currently.
This can be custom/faster/more compact serialization/deserialization, a faster/larger buffer/resizable version of ByteBuffer, or adding an atomic variable e.g. one not supported currently.
I have used it for all of these at some time.
One example of its use is the random method, which calls Unsafe to change the seed.
This site also has some uses of it.
The object appears to provide the ability to work at a lower level than Java code typically allows. If you're coding a high-level application, the JVM abstracts memory handling and other operations away from the code level so it's easier to program. By using the Unsafe library you're effectively performing low-level operations that would typically be done for you.
As woliveirajr stated, "random()" uses Unsafe to seed, just as many other operations use the allocateMemory() function included in Unsafe.
As a programmer you could probably get away with never needing this library, but having strict control over low-level elements does come in handy (that's why there is still assembly and, to a lesser extent, C code drifting around in major products).