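What is the global interpreter lock (GIL) in CPython?
What is a global interpreter lock and why is it an issue?
A lot of noise has been made around removing the GIL from Python, and I'd like to understand why that is so important. I have never written a compiler nor an interpreter myself, so don't be frugal with details, I'll probably need them to understand.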
Comments (8)
Python's GIL is intended to serialize access to interpreter internals from different threads. On multi-core systems, it means that multiple threads can't effectively make use of multiple cores. (If the GIL didn't lead to this problem, most people wouldn't care about the GIL - it's only being raised as an issue because of the increasing prevalence of multi-core systems.) If you want to understand it in detail, you can view this video or look at this set of slides. It might be too much information, but then you did ask for details :-)
Note that Python's GIL is only really an issue for CPython, the reference implementation. Jython and IronPython don't have a GIL. As a Python developer, you don't generally come across the GIL unless you're writing a C extension. C extension writers need to release the GIL when their extensions do blocking I/O, so that other threads in the Python process get a chance to run.
Suppose you have multiple threads which don't really touch each other's data. Those should execute as independently as possible. If you have a "global lock" which you need to acquire in order to (say) call a function, that can end up as a bottleneck. You can wind up not getting much benefit from having multiple threads in the first place.
To put it into a real world analogy: imagine 100 developers working at a company with only a single coffee mug. Most of the developers would spend their time waiting for coffee instead of coding.
None of this is Python-specific - I don't know the details of what Python needed a GIL for in the first place. However, hopefully it's given you a better idea of the general concept.
Let's first understand what the Python GIL provides:
Any operation/instruction is executed in the interpreter. The GIL ensures that the interpreter is held by a single thread at any particular instant of time. Your Python program with multiple threads works in a single interpreter, and at any particular instant this interpreter is held by a single thread. This means that only the thread holding the interpreter is running at any instant of time.
Now why is that an issue:
Your machine could have multiple cores/processors, and multiple cores allow multiple threads to execute simultaneously, i.e. multiple threads could execute at any particular instant of time.
But since the interpreter is held by a single thread, other threads are not doing anything even though they have access to a core. So, you are not getting any advantage provided by multiple cores because at any instant only a single core, which is the core being used by the thread currently holding the interpreter, is being used. So, your program will take as long to execute as if it were a single threaded program.
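A minimal sketch of this effect, assuming a standard (GIL-enabled) CPython build. The function name count_down and the workload size N are made up for illustration, and timings depend on the machine, so no expected numbers are shown.

```python
import threading
import time

def count_down(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL."""
    while n > 0:
        n -= 1
    return n

N = 2_000_000  # arbitrary workload size, chosen only for illustration

# Run the work twice, one call after the other.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two pieces of work in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s, two threads: {threaded:.3f}s")
```

On a GIL-enabled interpreter the two figures are typically in the same range, since only one thread executes Python bytecode at a time.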
However, potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Taken from here. So for such operations, a multithreaded operation will still be faster than a single threaded operation despite the presence of GIL. So, GIL is not always a bottleneck.
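As a rough illustration of the blocking case, the sketch below uses time.sleep as a stand-in for real I/O (an assumption made for simplicity; time.sleep releases the GIL while waiting). The name fake_io is hypothetical.

```python
import threading
import time

def fake_io(seconds):
    """Stand-in for a blocking I/O call; the GIL is released while sleeping."""
    time.sleep(seconds)

# Four waits of 0.2 s each would take about 0.8 s if run one after another.
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"4 waits of 0.2s took {elapsed:.2f}s in threads")
```

Because the waits overlap, the total should be much closer to a single wait than to the sum of all four.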
Edit: The GIL is an implementation detail of CPython. IronPython and Jython don't have a GIL, so a truly multithreaded program should be possible in them, though I have never used PyPy and Jython and am not sure of this.
Python 3.7 documentation
I would also like to highlight a quote from the Python threading documentation. It links to the Glossary entry for global interpreter lock, which explains that the GIL implies that threaded parallelism in Python is unsuitable for CPU bound tasks. That entry also implies that dicts, and thus variable assignment, are thread safe as a CPython implementation detail.
Next, the docs for the multiprocessing package explain how it overcomes the GIL by spawning processes while exposing an interface similar to that of threading.
The docs for concurrent.futures.ProcessPoolExecutor explain that it uses multiprocessing as a backend, which should be contrasted with the other executor class, ThreadPoolExecutor, which uses threads instead of processes. From this we conclude that ThreadPoolExecutor is only suitable for I/O bound tasks, while ProcessPoolExecutor can also handle CPU bound tasks.
Process vs thread experiments
At Multiprocessing vs Threading Python I've done an experimental analysis of process vs threads in Python.
Quick preview of the results:
In other languages
The concept seems to exist outside of Python as well, applying just as well to Ruby for example: https://en.wikipedia.org/wiki/Global_interpreter_lock
It mentions some advantages, but the JVM seems to do just fine without a GIL, so I wonder if it is worth it. The following question asks why the GIL exists in the first place: Why the Global Interpreter Lock?
Python doesn't allow multi-threading in the truest sense of the word. It has a multi-threading package but if you want to multi-thread to speed your code up, then it's usually not a good idea to use it. Python has a construct called the Global Interpreter Lock (GIL).
https://www.youtube.com/watch?v=ph374fJqFPE
The GIL makes sure that only one of your 'threads' can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL onto the next thread. This happens very quickly so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core. All this GIL passing adds overhead to execution. This means that if you want to make your code run faster then using the threading package often isn't a good idea.
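The turn-taking described above can be observed from Python itself. This is a small sketch assuming CPython 3.2 or later, where sys.getswitchinterval() reports how long a thread may run before the interpreter asks it to give up the GIL.

```python
import sys

# Interval, in seconds, after which CPython requests that the running
# thread release the GIL so that another thread can take a turn.
interval = sys.getswitchinterval()
print(f"switch interval: {interval} s")
```

The value can be changed with sys.setswitchinterval(), although tuning it does not let threads run Python bytecode on several cores at once.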
There are reasons to use Python's threading package. If you want to run some things simultaneously, and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O) then it could make a lot of sense. But the threading library won't let you use extra CPU cores.
Multi-threading can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (e.g. Spark or Hadoop), or to some code that your Python code calls (e.g. you could have your Python code call a C function that does the expensive multi-threaded stuff).
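One way to try the multi-processing route from the standard library is concurrent.futures. The sketch below is only an illustration: sum_of_squares and run_all are hypothetical helper names, and the input sizes are arbitrary.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def sum_of_squares(n):
    """CPU-bound toy task (hypothetical, for illustration only)."""
    return sum(i * i for i in range(n))

def run_all(executor_cls, inputs, workers=2):
    """Map sum_of_squares over inputs using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(sum_of_squares, inputs))

if __name__ == "__main__":
    inputs = [100_000, 200_000]
    # Threads: the tasks take turns holding the single GIL.
    print(run_all(ThreadPoolExecutor, inputs))
    # Processes: each worker runs its own interpreter with its own GIL.
    print(run_all(ProcessPoolExecutor, inputs))
```

Both calls return the same results; only the process pool can spread the work over several cores. The `if __name__ == "__main__":` guard matters for the process pool because worker processes may re-import the main module.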
Whenever two threads have access to the same variable you have a problem.
In C++ for instance, the way to avoid the problem is to define some mutex lock to prevent two threads from, let's say, entering the setter of an object at the same time.
Multithreading is possible in Python, but two threads cannot be executed at the same time at a granularity finer than one Python instruction.
The running thread acquires a global lock called the GIL.
This means that if you begin writing multithreaded code in order to take advantage of your multicore processor, your performance won't improve.
The usual workaround consists of going multiprocess.
Note that it is possible to release the GIL if you're inside a method you wrote in C, for instance.
The use of a GIL is not inherent to Python but to some of its interpreters, including the most common one, CPython.
(#edited, see comment)
The GIL issue is still valid in Python 3000.
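On the shared-variable point above: the GIL serializes individual interpreter instructions, but a statement such as counter += 1 is a read-modify-write made of several of them, so Python code still uses a mutex much like the C++ case. A minimal sketch, with the names counter, counter_lock and add_many chosen only for illustration:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    """Increment the shared counter, taking the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:
            # Read-modify-write: protected so no update is lost.
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000, since every increment happens under the lock
```

Without the lock the result is not guaranteed, which is why the GIL should not be treated as a substitute for explicit synchronization.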
Why Python (CPython and others) uses the GIL
From http://wiki.python.org/moin/GlobalInterpreterLock
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe.
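For context on the memory management remark: CPython tracks object lifetimes with per-object reference counts, and the GIL keeps updates to those counts from racing. The sketch below only shows that the counts exist and change as references are created; it is an illustration, not a demonstration of a race.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

holder = [obj]  # storing obj in a list creates one more reference to it
after = sys.getrefcount(obj)

print(f"references added: {after - before}")
```

If two threads updated such a count at the same time without any protection, an increment or decrement could be lost, which is the kind of problem the lock prevents.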
How to remove it from Python?
Like Lua, maybe Python could start multiple VMs, but Python doesn't do that; I guess there are some other reasons.
In NumPy or some other Python extension libraries, releasing the GIL to other threads can sometimes boost the efficiency of the whole programme.
I want to share an example from the book Multithreading for Visual Effects. So here is a classic deadlock situation:
Now consider the sequence of events resulting in a deadlock.