How is Java's ThreadLocal implemented under the hood?
How is ThreadLocal implemented? Is it implemented in Java (using some concurrent map from ThreadID to object), or does it use some JVM hook to do it more efficiently?
All of the answers here are correct, but a little disappointing, as they somewhat gloss over how clever ThreadLocal's implementation is. I was just looking at the source code for ThreadLocal and was pleasantly impressed by how it's implemented.
The Naive Implementation
If I asked you to implement a ThreadLocal<T> class given the API described in the javadoc, what would you do? An initial implementation would likely be a ConcurrentHashMap<Thread,T> using Thread.currentThread() as its key. This would work reasonably well but does have some disadvantages.
Thread contention: ConcurrentHashMap is a pretty smart class, but it ultimately still has to deal with preventing multiple threads from mucking with it in any way, and if different threads hit it regularly, there will be slowdowns.
Garbage collection: the map keeps a strong reference to both the Thread and its value, so neither can be collected, even after the thread has finished.
The GC-friendly Implementation
OK, let's try again and deal with the garbage collection issue by using weak references. Dealing with WeakReferences can be confusing, but it should be sufficient to use a map such as Collections.synchronizedMap(new WeakHashMap<Thread, T>()).
Or, if we're using Guava (and we should be!), something like new MapMaker().weakKeys().makeMap().
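To make the map-backed design concrete, here is a minimal sketch built on the JDK-only weak-keyed map. The class name MapBackedThreadLocal and the helper getFromNewThread are made up for illustration; this is not how the JDK does it, only the "first attempt" under discussion.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical name: a sketch of the map-backed design, not the JDK's ThreadLocal.
class MapBackedThreadLocal<T> {
    // Weak keys: an entry becomes collectable once its Thread is no longer referenced.
    // Every access still goes through one shared lock, so contention remains.
    private final Map<Thread, T> values =
            Collections.synchronizedMap(new WeakHashMap<Thread, T>());

    T get() {
        return values.get(Thread.currentThread());
    }

    void set(T value) {
        values.put(Thread.currentThread(), value);
    }

    // Demo helper (an assumption of this sketch): reads the variable from a new thread.
    T getFromNewThread() {
        final Object[] seen = new Object[1];
        Thread reader = new Thread(() -> seen[0] = get());
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        @SuppressWarnings("unchecked")
        T result = (T) seen[0];
        return result;
    }
}
```

A value set on one thread is visible only to that thread; getFromNewThread() returns null because the new thread has no entry of its own. Swapping the field for a ConcurrentHashMap<Thread, T> gives the naive version, with its strong references to every Thread.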
This means that once no one else is holding onto the Thread (implying it's finished), the key/value can be garbage collected. That is an improvement, but it still doesn't address the thread contention issue, meaning so far our ThreadLocal isn't all that amazing a class. Furthermore, if someone decided to hold onto Thread objects after they'd finished, they'd never be GC'ed, and therefore neither would our objects, even though they're technically unreachable now.
The Clever Implementation
We've been thinking about ThreadLocal as a mapping of threads to values, but maybe that's not actually the right way to think about it. Instead of thinking of it as a mapping from Threads to values in each ThreadLocal object, what if we thought about it as a mapping of ThreadLocal objects to values in each Thread? If each thread stores the mapping, and ThreadLocal merely provides a nice interface into that mapping, we can avoid all of the issues of the previous implementations.
An implementation would look something like this: a new WeakHashMap<ThreadLocal,T>() created for each thread and updated by the ThreadLocal instance.
There's no need to worry about concurrency here, because only one thread will ever be accessing this map.
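As a rough sketch of that inversion: since we cannot add a field to java.lang.Thread ourselves, the example below uses a Thread subclass to carry the per-thread map. MapCarryingThread, PerThreadLocal and demo() are hypothetical names invented for this illustration.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Stand-in for "a field added to Thread": we cannot edit java.lang.Thread,
// so this hypothetical subclass carries the per-thread map instead.
class MapCarryingThread extends Thread {
    // Only the owning thread touches this map, so no synchronization is needed.
    final Map<PerThreadLocal<?>, Object> locals = new WeakHashMap<>();

    MapCarryingThread(Runnable task) {
        super(task);
    }
}

// Hypothetical name: a thin interface over the current thread's own map.
class PerThreadLocal<T> {
    private static Map<PerThreadLocal<?>, Object> currentMap() {
        Thread t = Thread.currentThread();
        if (!(t instanceof MapCarryingThread)) {
            throw new IllegalStateException("sketch only works on MapCarryingThread");
        }
        return ((MapCarryingThread) t).locals;
    }

    @SuppressWarnings("unchecked")
    T get() {
        return (T) currentMap().get(this);
    }

    void set(T value) {
        currentMap().put(this, value);
    }

    // Demo helper: two threads write different values to the same variable,
    // then each records what it reads back.
    static String[] demo() {
        PerThreadLocal<String> local = new PerThreadLocal<>();
        String[] seen = new String[2];
        Thread a = new MapCarryingThread(() -> { local.set("a"); seen[0] = local.get(); });
        Thread b = new MapCarryingThread(() -> { local.set("b"); seen[1] = local.get(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Each thread reads back only what it wrote, and no lock is involved because the two threads never touch the same map.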
The Java devs have a major advantage over us here - they can directly develop the Thread class and add fields and operations to it, and that's exactly what they've done.
In java.lang.Thread there are the following lines: a comment reading "ThreadLocal values pertaining to this thread. This map is maintained by the ThreadLocal class." followed by the field declaration ThreadLocal.ThreadLocalMap threadLocals = null; As the comment suggests, this is indeed a package-private mapping of all values being tracked by ThreadLocal objects for this Thread. The implementation of ThreadLocalMap is not a WeakHashMap, but it follows the same basic contract, including holding its keys by weak reference.
ThreadLocal.get() then fetches the current thread's map via getMap(Thread.currentThread()), looks up the entry keyed by this ThreadLocal, and returns its value; if there is no map or no entry, it returns setInitialValue().
ThreadLocal.setInitialValue() calls initialValue(), stores the result in the current thread's map (creating the map first if necessary), and returns it.
Essentially, use a map in this Thread to hold all our ThreadLocal objects. This way, we never need to worry about the values in other Threads (ThreadLocal literally can only access the values in the current Thread) and therefore have no concurrency issues. Furthermore, once the Thread is done, its map will automatically be GC'ed and all the local objects will be cleaned up. Even if the Thread is held onto, the ThreadLocal objects are held by weak reference, and can be cleaned up once the ThreadLocal object is no longer reachable.
Needless to say, I was rather impressed by this implementation. It quite elegantly gets around a lot of concurrency issues (admittedly by taking advantage of being part of core Java, but that's forgivable since it's such a clever class) and allows for fast and thread-safe access to objects that only need to be accessed by one thread at a time.
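To observe the lazy per-thread initialization described above from the outside, here is a small sketch using the real java.lang.ThreadLocal API (ThreadLocal.withInitial, available since Java 8). The class LazyInitDemo and its members are hypothetical names for this example only.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only ThreadLocal itself is real JDK API here.
class LazyInitDemo {
    // Counts how many times the initializer runs (once per thread that calls get()).
    static final AtomicInteger INIT_CALLS = new AtomicInteger();

    // A thread's first get() finds no entry and falls back to the initializer.
    static final ThreadLocal<StringBuilder> BUFFER = ThreadLocal.withInitial(() -> {
        INIT_CALLS.incrementAndGet();
        return new StringBuilder();
    });

    // Appends to the calling thread's own buffer and returns its contents.
    static String append(String text) {
        return BUFFER.get().append(text).toString();
    }

    // Runs append() on a new thread and returns what that thread saw.
    static String appendOnNewThread(String text) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = append(text));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Two appends on one thread accumulate in the same buffer, while a new thread starts from a fresh buffer of its own; the initializer runs once for each thread that touches the variable.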
tl;dr
ThreadLocal's implementation is pretty cool, and much faster/smarter than you might think at first glance.
If you liked this answer you might also appreciate my (less detailed) discussion of ThreadLocalRandom.
Thread/ThreadLocal code snippets taken from Oracle/OpenJDK's implementation of Java 8.
You mean java.lang.ThreadLocal. It's quite simple, really: it's just a Map of key-value pairs (keyed by the ThreadLocal instance) stored inside each Thread object (see the Thread.threadLocals field). The API hides that implementation detail, but that's more or less all there is to it.
ThreadLocal variables in Java work by accessing a hash map held by the Thread.currentThread() instance (a ThreadLocal.ThreadLocalMap rather than a java.util.HashMap).
Suppose you're going to implement ThreadLocal: how do you make it thread-specific? Of course, the simplest method is to create a non-static field in the Thread class; let's call it threadLocals. Because each thread is represented by a Thread instance, threadLocals in every thread would be different, too. And this is also what Java does: Thread declares the field ThreadLocal.ThreadLocalMap threadLocals = null;
What is ThreadLocal.ThreadLocalMap here? Because you only have one threadLocals per thread, if you simply took threadLocals as your ThreadLocal (say, defined threadLocals as an Integer), you would only have one ThreadLocal for a specific thread. What if you want multiple ThreadLocal variables for a thread? The simplest way is to make threadLocals a HashMap: the key of each entry identifies the ThreadLocal variable (in the real implementation, the ThreadLocal instance itself), and the value of each entry is the value of that ThreadLocal variable. A little confusing? Let's say we have two threads, t1 and t2. They take the same Runnable instance as the parameter of the Thread constructor, and they both have two ThreadLocal variables named tlA and tlB. Each thread then has its own map, with one entry for tlA and one for tlB.
[Tables not recoverable: example map contents, headed t1.tlA and t2.tlB. Notice that the values were made up by me.]
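The two-threads, two-variables scenario can be tried out with a short sketch like the one below. The class name TwoThreadsTwoLocals, the run() helper and the stored strings are all made up for illustration; only ThreadLocal and Thread are real JDK classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: two threads share one Runnable and two ThreadLocal variables.
class TwoThreadsTwoLocals {
    static final ThreadLocal<String> tlA = new ThreadLocal<>();
    static final ThreadLocal<String> tlB = new ThreadLocal<>();

    // Runs the same task on threads named "t1" and "t2" and records what each reads back.
    static Map<String, String> run() {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Runnable task = () -> {
            String name = Thread.currentThread().getName();
            tlA.set("A-of-" + name);   // made-up values
            tlB.set("B-of-" + name);
            seen.put(name + ".tlA", tlA.get());
            seen.put(name + ".tlB", tlB.get());
        };
        Thread t1 = new Thread(task, "t1");
        Thread t2 = new Thread(task, "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

The result holds four independent entries, one per (thread, variable) pair, and the calling thread still sees null for tlA because it never set a value of its own.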
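Now it seems perfect. But what is ThreadLocal.ThreadLocalMap? Why didn't it just use HashMap? To answer that, let's see what happens when we set a value through the set(T value) method of the ThreadLocal class. set() takes the current thread t and calls getMap(t), which simply returns t.threadLocals. If that map already exists, it calls map.set(this, value). But because t.threadLocals is initialized to null, the first call goes to createMap(t, value) instead.
createMap creates a new ThreadLocalMap instance using the current ThreadLocal instance and the value to be set, and assigns it to t.threadLocals. Let's see what ThreadLocalMap is like; it is in fact a static nested class of ThreadLocal.
The core part of the ThreadLocalMap class is the Entry class, which extends WeakReference. The weak reference is to the ThreadLocal key, so once a ThreadLocal is no longer reachable anywhere else, its entry can be cleaned up even while the thread is still running; when the thread exits, the whole map becomes unreachable anyway. This is why it uses ThreadLocalMap instead of a simple HashMap. The constructor passes the current ThreadLocal and its value to a new Entry, which is stored in table, an array of Entry. So when we want to get the value, we look up the Entry for the current ThreadLocal in table.
The whole picture (original figure not available): each Thread has a threadLocals field pointing to its own ThreadLocalMap, whose table holds Entry objects; each Entry weakly references a ThreadLocal key and holds that thread's value for it.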
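As a rough illustration of the Entry idea described above (a weakly held key, a strongly held value), here is a simplified model. SketchEntry and SketchEntryDemo are hypothetical names, and clear() is called by hand to simulate the garbage collector clearing the key, so the outcome is deterministic.

```java
import java.lang.ref.WeakReference;

// Hypothetical, simplified model of an entry: the ThreadLocal key is the
// referent of the WeakReference, the value is an ordinary strong field.
class SketchEntry extends WeakReference<ThreadLocal<?>> {
    Object value;

    SketchEntry(ThreadLocal<?> key, Object value) {
        super(key);
        this.value = value;
    }

    // An entry whose key has been cleared is "stale": nothing can look it up anymore.
    boolean isStale() {
        return get() == null;
    }
}

class SketchEntryDemo {
    // Returns { key visible before clearing, entry stale afterwards, value still held }.
    static boolean[] run() {
        ThreadLocal<String> key = new ThreadLocal<>();
        SketchEntry entry = new SketchEntry(key, "payload");
        boolean keyVisibleBefore = entry.get() == key;
        entry.clear();  // simulates the GC clearing the weakly held key
        boolean staleAfter = entry.isStale();
        boolean valueStillHeld = entry.value != null;
        return new boolean[] { keyVisibleBefore, staleAfter, valueStillHeld };
    }
}
```

Note that the value stays strongly referenced after the key is gone. In the JDK, such stale entries are expunged during later operations on the same map, which is one reason calling remove() is commonly advised for long-lived pooled threads.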
Conceptually, you can think of a ThreadLocal<T> as holding a Map<Thread,T> that stores the thread-specific values, though this is not how it is actually implemented. The thread-specific values are stored in the Thread object itself; when the thread terminates, the thread-specific values can be garbage collected.
Reference: JCIP (Java Concurrency in Practice)