Thread local storage in Python
How do I use thread local storage in Python?
Related
- What is “thread local storage” in Python, and why do I need it? - This thread appears to be focused more on when variables are shared.
- Efficient way to determine whether a particular function is on the stack in Python - Alex Martelli gives a nice solution
Thread local storage is useful, for instance, if you have a thread worker pool and each thread needs access to its own resource, such as a network or database connection. Note that the `threading` module uses the regular concept of threads (which have access to the process-global data), but in CPython the global interpreter lock limits how much they help with CPU-bound work. The separate `multiprocessing` module creates a new sub-process for each worker, so every global is effectively local to that worker.

threading module
Here is a simple example:
This will print out:
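A minimal sketch of such an example and the output it produces. The names `thread_local`, `worker`, and `run_workers` are assumptions made for illustration. Each thread writes its own value, waits until every other thread has written too, and then reads its value back unchanged:

```python
import threading

# Created once at module level; every thread sees its own attributes on it.
thread_local = threading.local()

def worker(index, barrier, results):
    thread_local.value = index * 10      # stored for this thread only
    barrier.wait()                       # wait until every thread has written
    results[index] = thread_local.value  # still this thread's own value

def run_workers(count=3):
    barrier = threading.Barrier(count)
    results = {}
    threads = [
        threading.Thread(target=worker, args=(i, barrier, results))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(sorted(run_workers().items()))  # [(0, 0), (1, 10), (2, 20)]
```

With an ordinary shared object in place of `thread_local`, the last writer would win and every thread would read the same value.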
One important thing that is easily overlooked: a `threading.local()` object only needs to be created once, not once per thread nor once per function call. The `global` or `class` level are ideal locations.

Here is why: `threading.local()` actually creates a new instance each time it is called (just like any factory or class call would), so calling `threading.local()` multiple times constantly overwrites the original object, which in all likelihood is not what one wants. When any thread accesses an existing `threadLocal` variable (or whatever it is called), it gets its own private view of that variable.

This won't work as intended:
Will result in this output:
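The pitfall can be sketched as follows. The function names are hypothetical. The "wrong" pair creates a fresh `threading.local()` on every call, so nothing stored ever survives. The second pair reuses one module-level object:

```python
import threading

def set_value_wrong(value):
    # Bug: a brand-new local object is created here and discarded on return.
    storage = threading.local()
    storage.value = value

def get_value_wrong():
    # Another brand-new object: nothing was ever stored on this one.
    storage = threading.local()
    return getattr(storage, "value", None)

# Created once; all functions below share this single object.
shared_storage = threading.local()

def set_value(value):
    shared_storage.value = value

def get_value():
    return getattr(shared_storage, "value", None)

set_value_wrong(42)
print(get_value_wrong())  # None

set_value(42)
print(get_value())        # 42
```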
multiprocessing module
With `multiprocessing`, all global variables are effectively local to each worker, since the module runs each worker in a separate process rather than a thread, and every process has its own copy of the globals.

Consider this example, where the `processed` counter plays the role of thread local storage:

It will output something like this:

... of course, the thread IDs, the counts for each, and the order will vary from run to run.
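A minimal sketch of such an example, assuming a hypothetical `count_items` task and a pool of two worker processes. The `processed` global is an ordinary module-level variable, yet each worker process increments only its own copy:

```python
import multiprocessing
import os

processed = 0  # plain module-level global; each process has its own copy

def count_items(chunk):
    """Add the chunk size to this process's private counter."""
    global processed
    processed += len(chunk)
    return os.getpid(), processed

if __name__ == "__main__":
    chunks = [["a", "b"], ["c"], ["d", "e", "f"], ["g"]]
    with multiprocessing.Pool(processes=2) as pool:
        for pid, count in pool.map(count_items, chunks):
            print(f"process {pid} has processed {count} items so far")
    # The workers only changed their own copies, so this is still 0.
    print("counter in the parent process:", processed)
```

Which worker receives which chunk is decided by the pool, so the process IDs and per-process counts differ between runs.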
Thread-local storage can simply be thought of as a namespace (with values accessed via attribute notation). The difference is that each thread transparently gets its own set of attributes/values, so that one thread doesn't see the values from another thread.
Just like an ordinary object, you can create multiple `threading.local` instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.

Here's a simple example:
Output:
Note how each thread maintains its own counter, even though the `ns` attribute is a class member (and hence shared between the threads).

The same example could have used an instance variable or a local variable, but that wouldn't show much, as there's no sharing then (a dict would work just as well). There are cases where you'd need thread-local storage as instance variables or local variables, but they tend to be relatively rare (and pretty subtle).
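The class-member case can be sketched like this. The `Worker` class, its `steps` parameter, and the `run_workers` helper are assumptions made for illustration. `ns` is a single object shared by every instance, but each thread counts in its own namespace:

```python
import threading

class Worker(threading.Thread):
    ns = threading.local()  # class attribute: one object for all threads

    def __init__(self, steps, results):
        super().__init__()
        self.steps = steps
        self.results = results

    def run(self):
        self.ns.val = 0  # each thread initialises its own counter
        for _ in range(self.steps):
            self.ns.val += 1
        self.results[self.name] = self.ns.val

def run_workers(step_counts):
    results = {}
    workers = [Worker(steps, results) for steps in step_counts]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Report the final counter of each worker in creation order.
    return [results[w.name] for w in workers]

print(run_workers([3, 5, 2]))  # [3, 5, 2]
```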
As noted in the question, Alex Martelli gives a solution here. This function allows us to use a factory function to generate a default value for each thread.
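The idea of a factory-generated default can be sketched as below. This is an illustration of the technique, not necessarily the linked code. The helper name `thread_local_var` and the `_MISSING` sentinel are assumptions. The factory runs at most once per thread and per name, and later lookups in the same thread return the same object:

```python
import threading

_storage = threading.local()
_MISSING = object()  # sentinel, so that None can be a legitimate value

def thread_local_var(name, factory, *args, **kwargs):
    """Return this thread's value for `name`, creating it on first use."""
    value = getattr(_storage, name, _MISSING)
    if value is _MISSING:
        value = factory(*args, **kwargs)
        setattr(_storage, name, value)
    return value

def collect(key, results):
    thread_local_var("items", list).append(key)
    # The second lookup returns the list created above, not a new one.
    results[key] = thread_local_var("items", list)

def run_demo(keys=("a", "b")):
    results = {}
    threads = [threading.Thread(target=collect, args=(k, results)) for k in keys]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(sorted(run_demo().items()))  # [('a', ['a']), ('b', ['b'])]
```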
My way of doing thread local storage across modules/files. The following has been tested in Python 3.5:
In fileA, I start a thread which has a target function in another module/file.
In fileB, I set a local variable I want in that thread.
In fileC, I access the thread local variable of the current thread.
Additionally, just print the 'dictionary' variable to see the default values available, such as kwargs, args, etc.
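One way this could look, collapsed into a single file. It assumes the values are stored in the `__dict__` of the object returned by `threading.current_thread()`, which the mention of args and kwargs suggests but the text does not confirm. The helper names and the `label` key are hypothetical:

```python
import threading

def set_thread_value(key, value):
    # Role of "fileB": store a value on the current Thread object.
    threading.current_thread().__dict__[key] = value

def get_thread_value(key, default=None):
    # Role of "fileC": read it back from whichever thread is running.
    return threading.current_thread().__dict__.get(key, default)

def handle(label, results):
    set_thread_value("label", label)
    results[label] = get_thread_value("label")

def run_demo(labels=("first", "second")):
    # Role of "fileA": start threads whose target lives elsewhere.
    results = {}
    threads = [threading.Thread(target=handle, args=(lb, results)) for lb in labels]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(sorted(run_demo().items()))  # [('first', 'first'), ('second', 'second')]
print(get_thread_value("label"))   # None: the main thread never set it
```

The other entries in that dictionary are private attributes of `Thread`, so code relying on them may break between Python versions. A module-level `threading.local()` object imported by each file avoids that dependency.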
You can also write:
`mydata.x` will only exist in the current thread.
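A short sketch of that behaviour, with the hypothetical helper `other_thread_sees_x` added to check from a second thread. The attribute is set in the main thread and is simply absent elsewhere:

```python
import threading

mydata = threading.local()
mydata.x = 1  # set in the main thread only

def other_thread_sees_x():
    """Report whether a freshly started thread can see mydata.x."""
    seen = []
    t = threading.Thread(target=lambda: seen.append(hasattr(mydata, "x")))
    t.start()
    t.join()
    return seen[0]

print(mydata.x)               # 1
print(other_thread_sees_x())  # False
```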