How is Java's ThreadLocal implemented under the hood?

Posted 2024-07-29 01:28:54 · 87 characters · 6 views · 0 comments

How is ThreadLocal implemented? Is it implemented in Java (using some concurrent map from ThreadID to object), or does it use some JVM hook to do it more efficiently?

Comments (5)

苏璃陌 2024-08-05 01:28:54

All of the answers here are correct, but a little disappointing as they somewhat gloss over how clever ThreadLocal's implementation is. I was just looking at the source code for ThreadLocal and was pleasantly impressed by how it's implemented.

The Naive Implementation

If I asked you to implement a ThreadLocal<T> class given the API described in the javadoc, what would you do? An initial implementation would likely be a ConcurrentHashMap<Thread,T> using Thread.currentThread() as its key. This would work reasonably well but does have some disadvantages.

  • Thread contention - ConcurrentHashMap is a pretty smart class, but it ultimately still has to deal with preventing multiple threads from mucking with it in any way, and if different threads hit it regularly, there will be slowdowns.
  • Permanently keeps a pointer to both the Thread and the object, even after the Thread has finished and could be GC'ed.
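The naive design described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names NaiveThreadLocal, trackedThreads and runDemo are made up for this example and are not part of the JDK. It shows both drawbacks at once: every call goes through one shared map, and the entry for a finished thread stays in that map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

class NaiveThreadLocalDemo {
    // Hypothetical class for illustration; this is NOT how java.lang.ThreadLocal works.
    static final class NaiveThreadLocal<T> {
        // One map shared by every thread, keyed by strong references to Thread objects.
        private final ConcurrentHashMap<Thread, T> values = new ConcurrentHashMap<>();
        private final Supplier<? extends T> initial;

        NaiveThreadLocal(Supplier<? extends T> initial) {
            this.initial = initial;
        }

        T get() {
            // Every call from every thread goes through the same shared map.
            return values.computeIfAbsent(Thread.currentThread(), t -> initial.get());
        }

        void set(T value) {
            values.put(Thread.currentThread(), value);
        }

        int trackedThreads() {
            return values.size();
        }
    }

    // Returns {value seen by the calling thread, number of threads still tracked}.
    static int[] runDemo() {
        NaiveThreadLocal<Integer> counter = new NaiveThreadLocal<>(() -> 0);
        Thread worker = new Thread(() -> counter.set(42));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int callerValue = counter.get();
        // The worker has finished, but its entry is still held by the map.
        return new int[] { callerValue, counter.trackedThreads() };
    }

    public static void main(String[] args) {
        int[] result = runDemo();
        System.out.println("caller sees " + result[0] + ", tracked threads: " + result[1]);
    }
}
```

The caller gets its own initial value rather than the worker's 42, so the API behaves correctly; the remaining entry for the dead worker is the leak mentioned in the second bullet.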

The GC-friendly Implementation

OK, let's try again and deal with the garbage collection issue by using weak references. Dealing with WeakReferences can be confusing, but it should be sufficient to use a map built like so:

 Collections.synchronizedMap(new WeakHashMap<Thread, T>())

Or if we're using Guava (and we should be!):

new MapMaker().weakKeys().makeMap()

This means once no one else is holding onto the Thread (implying it's finished) the key/value can be garbage collected, which is an improvement, but still doesn't address the thread contention issue, meaning so far our ThreadLocal isn't all that amazing of a class. Furthermore, if someone decided to hold onto Thread objects after they'd finished, they'd never be GC'ed, and therefore neither would our objects, even though they're technically unreachable now.
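A minimal sketch of this second attempt, assuming the plain JDK variant (Collections.synchronizedMap over a WeakHashMap). The class name WeakKeyThreadLocal and the helper runDemo are hypothetical. The sketch only demonstrates the per-thread behaviour; it does not try to force a garbage collection, since that would not be deterministic.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

class WeakKeyThreadLocalDemo {
    // Hypothetical class for illustration; not the JDK implementation.
    static final class WeakKeyThreadLocal<T> {
        // Keys (Thread objects) are weakly referenced, but the map is still shared,
        // so every access from every thread synchronizes on the same lock.
        private final Map<Thread, T> values =
                Collections.synchronizedMap(new WeakHashMap<Thread, T>());

        T get() {
            return values.get(Thread.currentThread());
        }

        void set(T value) {
            values.put(Thread.currentThread(), value);
        }
    }

    // Returns {value the worker read back, value the calling thread read back}.
    static String[] runDemo() {
        WeakKeyThreadLocal<String> name = new WeakKeyThreadLocal<>();
        name.set("caller-value");
        String[] seen = new String[2];
        Thread worker = new Thread(() -> {
            name.set("worker-value");
            seen[0] = name.get();
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[1] = name.get();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = runDemo();
        System.out.println("worker saw " + seen[0] + ", caller saw " + seen[1]);
    }
}
```

Note that a WeakHashMap holds its values strongly, so a value that itself refers back to its Thread would keep that key from ever being cleared.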

The Clever Implementation

We've been thinking about ThreadLocal as a mapping of threads to values, but maybe that's not actually the right way to think about it. Instead of thinking of it as a mapping from Threads to values in each ThreadLocal object, what if we thought about it as a mapping of ThreadLocal objects to values in each Thread? If each thread stores the mapping, and ThreadLocal merely provides a nice interface into that mapping, we can avoid all of the issues of the previous implementations.

An implementation would look something like this:

// called for each thread, and updated by the ThreadLocal instance
new WeakHashMap<ThreadLocal,T>()

There's no need to worry about concurrency here, because only one thread will ever be accessing this map.
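Since ordinary code cannot add a field to java.lang.Thread, the inversion can only be approximated outside the JDK. The sketch below assumes a hypothetical Thread subclass, MapCarryingThread, that carries the per-thread map, and a hypothetical PerThreadLocal class that only works on such threads. It is meant to show the shape of the idea, not the real implementation.

```java
import java.util.Map;
import java.util.WeakHashMap;

class PerThreadMapDemo {
    // Hypothetical Thread subclass standing in for a field inside Thread itself.
    static final class MapCarryingThread extends Thread {
        // Only this thread reads or writes the map, so no synchronization is needed.
        final Map<PerThreadLocal<?>, Object> locals = new WeakHashMap<>();

        MapCarryingThread(Runnable task) {
            super(task);
        }
    }

    // Hypothetical ThreadLocal look-alike: a thin interface over the current thread's map.
    static final class PerThreadLocal<T> {
        private static Map<PerThreadLocal<?>, Object> currentMap() {
            Thread t = Thread.currentThread();
            if (!(t instanceof MapCarryingThread)) {
                throw new IllegalStateException("sketch only works on MapCarryingThread");
            }
            return ((MapCarryingThread) t).locals;
        }

        @SuppressWarnings("unchecked")
        T get() {
            return (T) currentMap().get(this);
        }

        void set(T value) {
            currentMap().put(this, value);
        }
    }

    // Two threads write different values through the same PerThreadLocal instance.
    static int[] runDemo() {
        PerThreadLocal<Integer> id = new PerThreadLocal<>();
        int[] seen = new int[2];
        Thread a = new MapCarryingThread(() -> { id.set(1); seen[0] = id.get(); });
        Thread b = new MapCarryingThread(() -> { id.set(2); seen[1] = id.get(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = runDemo();
        System.out.println("thread a saw " + seen[0] + ", thread b saw " + seen[1]);
    }
}
```

The restriction to MapCarryingThread is exactly the limitation that being part of the core library removes.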

The Java devs have a major advantage over us here - they can directly develop the Thread class and add fields and operations to it, and that's exactly what they've done.

In java.lang.Thread there are the following lines:

/* ThreadLocal values pertaining to this thread. This map is maintained
 * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

As the comment suggests, this is indeed a package-private mapping of all values being tracked by ThreadLocal objects for this Thread. The implementation of ThreadLocalMap is not a WeakHashMap, but it follows the same basic contract, including holding its keys by weak reference.

ThreadLocal.get() is then implemented like so:

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

And ThreadLocal.setInitialValue() like so:

private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}
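The effect of the get() and setInitialValue() pair can be observed from the outside with the public API. The following sketch (the class name LazyInitialValueDemo and the counter initCalls are made up for illustration) counts how often the initial-value supplier runs: not at construction time, once on the first get() in a thread, and once more for the first get() in a second thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyInitialValueDemo {
    // Returns the supplier call count {before any get, after two gets, after a worker's get}.
    static int[] runDemo() {
        AtomicInteger initCalls = new AtomicInteger();
        ThreadLocal<StringBuilder> buffer = ThreadLocal.withInitial(() -> {
            initCalls.incrementAndGet();
            return new StringBuilder();
        });

        int beforeAnyGet = initCalls.get();

        // The first get() in this thread finds no entry and falls through to setInitialValue();
        // the second get() finds the entry and returns it directly.
        buffer.get();
        buffer.get();
        int afterCallerGets = initCalls.get();

        // A different thread has its own map, so its first get() initializes again.
        Thread worker = new Thread(() -> buffer.get());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int afterWorkerGet = initCalls.get();

        return new int[] { beforeAnyGet, afterCallerGets, afterWorkerGet };
    }

    public static void main(String[] args) {
        int[] counts = runDemo();
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```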

Essentially, each Thread holds a map containing the values of all our ThreadLocal objects. This way, we never need to worry about the values in other Threads (ThreadLocal literally can only access the values in the current Thread) and therefore have no concurrency issues. Furthermore, once the Thread is done, its map will automatically be GC'ed and all the local objects will be cleaned up. Even if the Thread is held onto, the ThreadLocal keys are held by weak reference, so an entry can be cleaned up once its ThreadLocal object is no longer reachable.


Needless to say, I was rather impressed by this implementation. It quite elegantly gets around a lot of concurrency issues (admittedly by taking advantage of being part of core Java, but that's forgivable since it's such a clever class) and allows for fast and thread-safe access to objects that only need to be accessed by one thread at a time.

tl;dr ThreadLocal's implementation is pretty cool, and much faster/smarter than you might think at first glance.

If you liked this answer you might also appreciate my (less detailed) discussion of ThreadLocalRandom.

Thread/ThreadLocal code snippets taken from Oracle/OpenJDK's implementation of Java 8.

计㈡愣 2024-08-05 01:28:54

You mean java.lang.ThreadLocal. It's quite simple, really, it's just a Map of name-value pairs stored inside each Thread object (see the Thread.threadLocals field). The API hides that implementation detail, but that's more or less all there is to it.

猛虎独行 2024-08-05 01:28:54

ThreadLocal variables in Java work by accessing a hash map (a ThreadLocal.ThreadLocalMap rather than a java.util.HashMap) held by the Thread.currentThread() instance.

沉溺在你眼里的海 2024-08-05 01:28:54

Suppose you're going to implement ThreadLocal. How do you make it thread-specific? The simplest method is to create a non-static field in the Thread class; let's call it threadLocals. Because each thread is represented by a Thread instance, threadLocals is different in every thread, too. And this is also what Java does:

/* ThreadLocal values pertaining to this thread. This map is maintained
* by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

What is ThreadLocal.ThreadLocalMap here? You only have one threadLocals field per thread, so if you simply used threadLocals as your ThreadLocal (say, by defining threadLocals as an Integer), you would only have one ThreadLocal for a specific thread. What if you want multiple ThreadLocal variables for a thread? The simplest way is to make threadLocals a HashMap, where the key of each entry is the name of the ThreadLocal variable and the value of each entry is the value of the ThreadLocal variable. A little confusing? Let's say we have two threads, t1 and t2. They take the same Runnable instance as the parameter of the Thread constructor, and they both have two ThreadLocal variables named tlA and tlB. This is what it looks like:

t1.threadLocals

+-----+-------+
| Key | Value |
+-----+-------+
| tlA |     0 |
| tlB |     1 |
+-----+-------+

t2.threadLocals

+-----+-------+
| Key | Value |
+-----+-------+
| tlA |     2 |
| tlB |     3 |
+-----+-------+

Note that these values are made up.
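The two tables can be reproduced with the real java.lang.ThreadLocal. In this sketch the class name TwoLocalsDemo, the helper runDemo, and the trick of deriving the values from the thread name are all assumptions made for the example; the values 0 to 3 simply mirror the made-up ones in the tables.

```java
class TwoLocalsDemo {
    static final ThreadLocal<Integer> tlA = new ThreadLocal<>();
    static final ThreadLocal<Integer> tlB = new ThreadLocal<>();

    // Returns {t1's tlA, t1's tlB, t2's tlA, t2's tlB} as each thread read them back.
    static int[] runDemo() {
        int[] seen = new int[4];
        // One Runnable instance shared by both threads, as in the text.
        Runnable task = () -> {
            int base = Thread.currentThread().getName().equals("t1") ? 0 : 2;
            tlA.set(base);       // stored in the current thread's own map under key tlA
            tlB.set(base + 1);   // stored in the current thread's own map under key tlB
            seen[base] = tlA.get();
            seen[base + 1] = tlB.get();
        };
        Thread t1 = new Thread(task, "t1");
        Thread t2 = new Thread(task, "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = runDemo();
        System.out.println("t1: tlA=" + seen[0] + " tlB=" + seen[1]);
        System.out.println("t2: tlA=" + seen[2] + " tlB=" + seen[3]);
    }
}
```

Both threads use the same two ThreadLocal objects as keys, yet neither overwrites the other's values, because each set() lands in the map of the thread that called it.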

Now it seems perfect. But what is ThreadLocal.ThreadLocalMap? Why didn't it just use HashMap? To answer that, let's see what happens when we set a value through the set(T value) method of the ThreadLocal class:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

getMap(t) simply returns t.threadLocals. Because t.threadLocals was initialized to null, we enter createMap(t, value) first:

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

It creates a new ThreadLocalMap instance using the current ThreadLocal instance and the value to be set. Let's see what ThreadLocalMap looks like; it's in fact part of the ThreadLocal class:

static class ThreadLocalMap {

    /**
     * The entries in this hash map extend WeakReference, using
     * its main ref field as the key (which is always a
     * ThreadLocal object).  Note that null keys (i.e. entry.get()
     * == null) mean that the key is no longer referenced, so the
     * entry can be expunged from table.  Such entries are referred to
     * as "stale entries" in the code that follows.
     */
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

    ...

    /**
     * Construct a new map initially containing (firstKey, firstValue).
     * ThreadLocalMaps are constructed lazily, so we only create
     * one when we have at least one entry to put in it.
     */
    ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
        table = new Entry[INITIAL_CAPACITY];
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        setThreshold(INITIAL_CAPACITY);
    }

    ...

}
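The constructor above picks a slot with firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1). The snippet does not show where threadLocalHashCode comes from. The sketch below assumes, from my reading of the OpenJDK 8 source rather than from anything quoted here, that each new ThreadLocal takes the next value of a counter advanced by 0x61c88647 and that INITIAL_CAPACITY is 16. Under those assumptions it computes which slots the first few ThreadLocal instances would map to. The class and method names are hypothetical.

```java
import java.util.Arrays;

class ThreadLocalSlotDemo {
    // Assumed values, taken from the OpenJDK 8 source as I recall it; not shown in the snippet above.
    static final int HASH_INCREMENT = 0x61c88647;
    static final int INITIAL_CAPACITY = 16;

    // Slot index for each of the first `count` hash codes in the assumed sequence.
    static int[] firstSlots(int count) {
        int[] slots = new int[count];
        int hash = 0;
        for (int i = 0; i < count; i++) {
            // Same masking step as in the ThreadLocalMap constructor.
            slots[i] = hash & (INITIAL_CAPACITY - 1);
            hash += HASH_INCREMENT; // int overflow wraps, which is intended here
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstSlots(INITIAL_CAPACITY)));
    }
}
```

Because the increment is odd and the capacity is a power of two, 16 consecutive hash codes land in 16 different slots, which keeps collisions rare in a table that uses simple probing instead of buckets.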

The core part of the ThreadLocalMap class is the Entry class, which extends WeakReference. Because the key is only weakly referenced, once a ThreadLocal is no longer referenced anywhere else its entry becomes stale and can be expunged from the table, as the comment above describes. This is why it uses ThreadLocalMap instead of a simple HashMap. set() wraps the current ThreadLocal and its value in an Entry, so when we want to get the value, we can look it up in table, which is an array of Entry instances:

public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            return result;
        }
    }
    return setInitialValue();
}

This is what the whole picture looks like:

[Figure: The Whole Picture]

黯然#的苍凉 2024-08-05 01:28:54

Conceptually, you can think of a ThreadLocal<T> as holding a Map<Thread,T> that stores the thread-specific values, though this is not how it is actually implemented.

The thread-specific values are stored in the Thread object itself; when the thread terminates, the thread-specific values can be garbage collected.

Reference: JCIP (Java Concurrency in Practice)
