What is "thread local storage" in Python, and why do I need it?

Posted 2024-07-04 19:38:14

In Python specifically, how do variables get shared between threads?

Although I have used threading.Thread before, I never really understood or saw examples of how variables get shared. Are they shared between the main thread and the children, or only among the children? When would I need to use thread-local storage to avoid this sharing?

I have seen many warnings about synchronizing access to shared data among threads by using locks but I have yet to see a really good example of the problem.

Comments (6)

橘香 2024-07-11 19:38:14

Worth mentioning: threading.local() is not a singleton.

You can create more than one per thread.
Each instance is its own separate storage.
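
A minimal sketch of that point, assuming two hypothetical stores named request_ctx and db_ctx (neither name comes from the thread): each call to threading.local() returns a separate object, and attributes set on one are not visible on the other.

```python
import threading

# Two independent thread-local stores; the names are illustrative only.
request_ctx = threading.local()
db_ctx = threading.local()

request_ctx.user = "alice"
db_ctx.connection = "conn-1"

# Attributes set on one store do not appear on the other.
has_user = hasattr(db_ctx, "user")
has_connection = hasattr(request_ctx, "connection")
print(has_user, has_connection)  # False False
```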

白况 2024-07-11 19:38:14

I may be wrong here. If you know otherwise, please expound, as this would help explain why one would need to use threading.local().

This statement seems off, though not wrong: "If you want to atomically modify anything that another thread has access to, you have to protect it with a lock." I think this statement is effectively right but not entirely accurate. I thought the term "atomic" meant that the Python interpreter created a byte-code chunk that left no room for an interrupt signal to the CPU.

I thought atomic operations are chunks of Python byte code that cannot be interrupted. A Python statement like "running = True" is atomic. You do not need to lock out interrupts in this case (I believe). The Python byte code breakdown is safe from thread interruption.

Python code like "threads_running[5] = True" is not atomic. There are two chunks of Python byte code here: one to dereference the list object, and another to assign a value to a "place" in that list. An interrupt can be raised between the two byte-code chunks. That is where bad stuff happens.

How does threading.local() relate to "atomic"? This is why the statement seems misdirecting to me. If not, can you explain?
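
One way to check claims like these is to look at the bytecode directly rather than guess. The sketch below uses the standard dis module; the helper name opnames is hypothetical, and the exact instruction names differ between CPython versions, so no specific output is shown. It only lets you see how many instructions each statement compiles to.

```python
import dis

def opnames(source):
    """Return the bytecode instruction names for a snippet of source."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

# The two statements discussed above, compiled as module-level code.
simple_ops = opnames("running = True")
subscript_ops = opnames("threads_running[5] = True")

print(simple_ops)
print(subscript_ops)
```

Both statements compile to more than one instruction; the subscript assignment needs extra loads before its store.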

可是我不能没有你 2024-07-11 19:38:14

You can create thread-local storage using threading.local().

>>> import threading
>>> tls = threading.local()
>>> tls.x = 4
>>> tls.x
4

Data stored in tls is unique to each thread, which helps ensure that unintentional sharing does not occur.
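
To see the "unique to each thread" part in action, here is a small sketch that goes beyond the original answer: the main thread sets tls.x, and a worker thread (the function name worker and the list seen_in_worker are made up for this illustration) finds no such attribute until it sets its own.

```python
import threading

tls = threading.local()
tls.x = 4  # set in the main thread

seen_in_worker = []  # ordinary list, shared, used only to report results

def worker():
    # This thread starts with its own empty view of tls.
    seen_in_worker.append(hasattr(tls, "x"))
    tls.x = 99  # visible only inside this worker thread
    seen_in_worker.append(tls.x)

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker)  # [False, 99]
print(tls.x)           # 4, the main thread's value is untouched
```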

把回忆走一遍 2024-07-11 19:38:14

Just like in every other language, every thread in Python has access to the same variables. There's no distinction between the 'main thread' and child threads.
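
As a rough illustration (not part of the original answer), the main thread and two child threads below all append to the same module-level list; the names shared and worker are hypothetical.

```python
import threading

shared = []  # one module-level list, reachable from every thread

def worker(label):
    shared.append(label)  # mutates the same list the main thread sees

children = [
    threading.Thread(target=worker, args=(f"child-{i}",)) for i in range(2)
]
for t in children:
    t.start()

shared.append("main")  # the main thread writes to it too

for t in children:
    t.join()

print(sorted(shared))  # ['child-0', 'child-1', 'main']
```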

One difference with Python is that the Global Interpreter Lock (in standard CPython builds) means that only one thread can be running Python code at a time. This isn't much help when it comes to synchronising access, however, as all the usual pre-emption issues still apply, and you have to use threading primitives just like in other languages. It does mean you need to reconsider whether threads are the right tool if you were using them for performance.

愛上了 2024-07-11 19:38:14

Consider the following code:

#!/usr/bin/env python3

from time import sleep
from random import random
from threading import Thread, local

data = local()  # each thread sees only its own attributes on this object

def bar():
    print("I'm called from", data.v)

def foo():
    bar()

class T(Thread):
    def run(self):
        sleep(random())
        data.v = self.name   # Thread-1 and Thread-2 accordingly
        sleep(1)
        foo()

>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-1

Here threading.local() is used as a quick and dirty way to pass some data from run() to bar() without changing the interface of foo().

Note that using global variables won't do the trick:

#!/usr/bin/env python3

from time import sleep
from random import random
from threading import Thread

def bar():
    global v
    print("I'm called from", v)

def foo():
    bar()

class T(Thread):
    def run(self):
        global v
        sleep(random())
        v = self.name   # one shared global: the later assignment overwrites the earlier one
        sleep(1)
        foo()

>>> T().start(); T().start()
I'm called from Thread-2
I'm called from Thread-2

Meanwhile, if you can afford to pass this data through as an argument of foo(), that is the more elegant and better-designed approach:

from threading import Thread

def bar(v):
    print("I'm called from", v)

def foo(v):
    bar(v)

class T(Thread):
    def run(self):
        foo(self.name)

But this is not always possible when using third-party or poorly designed code.

疯狂的代价 2024-07-11 19:38:14

In Python, everything is shared, except for function-local variables (because each function call gets its own set of locals, and threads are always separate function calls.) And even then, only the variables themselves (the names that refer to objects) are local to the function; objects themselves are always global, and anything can refer to them.
The Thread object for a particular thread is not a special object in this regard. If you store the Thread object somewhere all threads can access (like a global variable) then all threads can access that one Thread object. If you want to atomically modify anything that another thread has access to, you have to protect it with a lock. And all threads must of course share this very same lock, or it wouldn't be very effective.
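
A minimal sketch of the lock advice, under the assumption of a simple shared counter (the names counter, counter_lock and add_many are invented for this example): every thread acquires the same Lock around the read-modify-write, so no increment is lost.

```python
import threading

counter = 0
counter_lock = threading.Lock()  # the one lock every thread must share

def add_many(n):
    global counter
    for _ in range(n):
        # "counter += 1" is a read, an add, and a write; hold the lock
        # so no other thread can interleave between those steps.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

If each thread created its own Lock instead, the with block would never make any thread wait for another, which is the point of the answer's remark about sharing the very same lock.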

If you want actual thread-local storage, that's where threading.local comes in. Attributes of threading.local are not shared between threads; each thread sees only the attributes it itself placed in there. If you're curious about its implementation, the source is in _threading_local.py in the standard library.
