What exactly is a reentrant function?
Most of the time, the definition of reentrancy is quoted from Wikipedia:
A computer program or routine is described as reentrant if it can be safely called again before its previous invocation has been completed (i.e. it can be safely executed concurrently). To be reentrant, a computer program or routine:
- Must hold no static (or global) non-constant data.
- Must not return the address to static (or global) non-constant data.
- Must work only on the data provided to it by the caller.
- Must not rely on locks to singleton resources.
- Must not modify its own code (unless executing in its own unique thread storage).
- Must not call non-reentrant computer programs or routines.
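To make the first three points concrete, here is a minimal sketch. The names format_id_static and format_id are made up for illustration and do not come from the quoted definition. The first function keeps static data and returns its address, while the second works only on storage supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Not reentrant: keeps static non-constant data and returns its address,
// so every call shares (and overwrites) the same buffer.
const char* format_id_static(int id) {
    static char buffer[32];
    std::snprintf(buffer, sizeof buffer, "id-%d", id);
    return buffer;
}

// Reentrant: touches only the storage supplied by the caller.
void format_id(int id, char* out, std::size_t out_size) {
    assert(out != nullptr && out_size > 0);
    std::snprintf(out, out_size, "id-%d", id);
}
```

With the static version, a pointer obtained from an earlier call silently changes content after the next call. With the caller-provided buffer, each call site owns its own result.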
How is "safely" defined?
If a program can be safely executed concurrently, does it always mean that it is reentrant?
What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
Also,
- Are all recursive functions reentrant?
- Are all thread-safe functions reentrant?
- Are all recursive and thread-safe functions reentrant?
While writing this question, one thing comes to mind:
Are terms like reentrancy and thread safety absolute at all, i.e. do they have fixed, concrete definitions? If they do not, this question is not very meaningful.
1. How is "safely" defined?
Semantically. In this case, this is not a hard-defined term. It just means "You can do that, without risk".
2. If a program can be safely executed concurrently, does it always mean that it is reentrant?
No.
For example, let's have a C++ function foo that acquires a lock on a mutex and takes a callback as a parameter.
Another function, bar, could well need to lock the same mutex, for instance because it calls foo itself.
At first sight, everything seems ok… But wait:
If the lock on the mutex is not recursive, then here's what will happen in the main thread:
- main will call foo.
- foo will acquire the lock.
- foo will call bar, which will call foo.
- foo will try to acquire the lock, fail and wait for it to be released.
Ok, I cheated, using the callback thing. But it's easy to imagine more complex pieces of code having a similar effect.
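The scenario above can be sketched as follows. This is an illustration under assumptions, not the listing this answer originally showed: SimpleLock, g_lock and g_blocked_calls are hypothetical names, and the lock reports failure through try_lock() instead of blocking, so that the sketch can run to completion. Locking a std::mutex twice from the same thread would be undefined behavior, which is why a small hand-rolled lock is used here.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical non-recursive lock; try_lock() reports failure instead of blocking.
class SimpleLock {
public:
    bool try_lock() { return !locked_.exchange(true, std::memory_order_acquire); }
    void unlock() { locked_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> locked_{false};
};

SimpleLock g_lock;                    // shared by every call to foo
std::atomic<int> g_blocked_calls{0};  // calls that a blocking lock would have hung

// Thread-safe in the usual sense: the body only runs while the lock is held.
// Returns false when the lock was already taken.
bool foo(const std::function<void()>& callback) {
    if (!g_lock.try_lock()) {
        ++g_blocked_calls;  // a blocking lock() would wait forever here
        return false;
    }
    if (callback) {
        callback();         // user code runs while the lock is still held
    }
    g_lock.unlock();
    return true;
}

// Re-enters foo on the same thread, as in the steps listed above.
void bar() { foo(nullptr); }

// Single-threaded demonstration of the failed re-entry.
void demo_reentry() {
    assert(foo(bar));              // the outer call gets the lock
    assert(g_blocked_calls == 1);  // the nested call could not
}
```

Two different threads calling foo would simply take turns, so the function looks thread-safe. The nested call on one thread is what breaks it.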
3. What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
You can smell a problem if your function has/gives access to a modifiable persistent resource, or has/gives access to a function that smells.
(Ok, 99% of our code should smell, then… See last section to handle that…)
So, studying your code, one of those points should alert you:
Note that non-reentrancy is viral: a function that could call a possibly non-reentrant function cannot be considered reentrant.
Note, too, that C++ methods smell because they have access to this, so you should study the code to be sure they have no funny interaction.
4.1. Are all recursive functions reentrant?
No.
In multithreaded cases, a recursive function accessing a shared resource could be called by multiple threads at the same moment, resulting in bad/corrupted data.
In singlethreaded cases, a recursive function could use a non-reentrant function (like the infamous strtok), or use global data without handling the fact the data is already in use. So your function is recursive because it calls itself directly or indirectly, but it can still be recursive-unsafe.
4.2. Are all thread-safe functions reentrant?
In the example above, I showed how an apparently thread-safe function was not reentrant. OK, I cheated because of the callback parameter. But then, there are multiple ways to deadlock a thread by having it acquire a non-recursive lock twice.
4.3. Are all recursive and thread-safe functions reentrant?
I would say "yes" if by "recursive" you mean "recursive-safe".
If you can guarantee that a function can be called simultaneously by multiple threads, and can call itself, directly or indirectly, without problems, then it is reentrant.
The problem is evaluating this guarantee… ^_^
5. Are the terms like reentrance and thread safety absolute at all, i.e. do they have fixed concrete definitions?
I believe they do, but then, evaluating whether a function is thread-safe or reentrant can be difficult. This is why I used the term smell above: you can find that a function is not reentrant, but it could be difficult to be sure a complex piece of code is reentrant.
6. An example
Let's say you have an object, with one method that needs to use a resource:
The first problem is that if somehow this function is called recursively (i.e. this function calls itself, directly or indirectly), the code will probably crash, because this->p will be deleted at the end of the last call, and still probably be used before the end of the first call.
Thus, this code is not recursive-safe.
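A minimal sketch of that failure, under assumptions: Resource, Naive and the resource_lost flag are invented for illustration, and the nested call stands in for the "lots of code" that might re-enter the method. Instead of dereferencing the dangling pointer, the sketch only records that the resource vanished, so it stays well-defined.

```cpp
#include <cassert>

struct Resource { int value = 0; };  // hypothetical resource type

struct Naive {
    Resource* p = nullptr;
    bool resource_lost = false;  // set when a nested call freed p under us

    void foo(int depth) {
        assert(depth >= 0);
        if (p == nullptr) {
            p = new Resource();
        }
        if (depth > 0) {
            foo(depth - 1);  // stands in for code that re-enters foo
        }
        if (p == nullptr) {
            resource_lost = true;  // real code would use p here and crash
        }
        delete p;  // deleting a null pointer is a no-op
        p = nullptr;
    }
};
```

Calling foo(0) behaves, while foo(1) leaves the outer call without its resource as soon as the inner call returns.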
We could use a reference counter to correct this:
This way, the code becomes recursive-safe… But it is still not reentrant because of multithreading issues: we must be sure the modifications of c and of p will be done atomically, using a recursive mutex (not all mutexes are recursive):
And of course, this all assumes the "lots of code" is itself reentrant, including the use of p.
And the code above is not even remotely exception-safe, but this is another story… ^_^
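One way the reference counter and the recursive mutex could fit together is sketched below. It is an assumption-laden illustration rather than the code this answer originally showed: Counted, Resource and idle() are hypothetical names, and the lock is held for the whole call, which is the simplest arrangement and the one that makes a recursive mutex necessary.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

struct Resource { int value = 0; };  // hypothetical resource type

class Counted {
public:
    // Returns true if the resource was still alive after the nested call.
    bool foo(int depth) {
        assert(depth >= 0);
        // Held for the whole call, so the nested call below locks again on
        // the same thread: that is why the mutex has to be recursive.
        std::lock_guard<std::recursive_mutex> guard(m_);
        if (c_ == 0) {
            p_ = new Resource();
        }
        ++c_;
        if (depth > 0) {
            foo(depth - 1);  // stands in for code that re-enters foo
        }
        const bool alive = (p_ != nullptr);
        if (--c_ == 0) {
            delete p_;
            p_ = nullptr;
        }
        return alive;
    }

    // True when no call is in progress and the resource has been released.
    bool idle() {
        std::lock_guard<std::recursive_mutex> guard(m_);
        return c_ == 0 && p_ == nullptr;
    }

private:
    std::recursive_mutex m_;
    std::size_t c_ = 0;
    Resource* p_ = nullptr;
};
```

Holding the lock across the whole body serializes threads, so this trades concurrency for simplicity. As noted above, an exception thrown in the middle would still leave the counter wrong.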
7. Hey, 99% of our code is not reentrant!
It is quite true for spaghetti code. But if you partition your code correctly, you will avoid reentrancy problems.
7.1. Make sure all functions have NO state
They must only use the parameters, their own local variables, other functions without state, and return copies of the data if they return at all.
7.2. Make sure your object is "recursive-safe"
An object method has access to this, so it shares a state with all the methods of the same instance of the object.
So, make sure the object can be used at one point in the stack (i.e. calling method A), and then, at another point (i.e. calling method B), without corrupting the whole object. Design your object to make sure that upon exiting a method, the object is stable and correct (no dangling pointers, no contradicting data members, etc.).
7.3. Make sure all your objects are correctly encapsulated
No one else should have access to their internal data:
Even returning a const reference could be dangerous if the user retrieves the address of the data, as some other portion of the code could modify it without the code holding the const reference being told.
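A small sketch of the difference, assuming a hypothetical Account class (none of these names come from the answer): a const reference keeps tracking the internal member, while a returned copy is frozen at the moment of the call.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Account {
public:
    explicit Account(std::string name) : name_(std::move(name)) {}

    // Exposes internal state: the caller's reference follows later changes.
    const std::string& name_ref() const { return name_; }

    // Safer: the caller gets an independent copy.
    std::string name_copy() const { return name_; }

    void rename(std::string new_name) { name_ = std::move(new_name); }

private:
    std::string name_;
};

// The holder of the const reference is never told that the value changed.
void demo_encapsulation() {
    Account acc("alice");
    const std::string& view = acc.name_ref();
    const std::string copy = acc.name_copy();
    acc.rename("bob");
    assert(view == "bob");
    assert(copy == "alice");
}
```

Returning a copy costs an allocation, which is the price of not sharing state with the caller.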
7.4. Make sure the user knows your object is not thread-safe
Thus, the user is responsible for using mutexes to protect an object shared between threads.
The objects from the STL are designed not to be thread-safe (because of performance issues), and thus, if a user wants to share a std::string between two threads, the user must protect access to it with concurrency primitives.
7.5. Make sure your thread-safe code is recursive-safe
This means using recursive mutexes if you believe the same resource can be used twice by the same thread.
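Points 7.4 and 7.5 can be sketched together. SharedLog is a hypothetical wrapper, not something from the standard library: it guards a std::string with a std::recursive_mutex, and one locked method calls another locked method on the same thread.

```cpp
#include <cassert>
#include <mutex>
#include <string>

class SharedLog {
public:
    void append(const std::string& text) {
        std::lock_guard<std::recursive_mutex> guard(m_);
        text_ += text;
    }

    // Calls append() while already holding the lock. With a plain std::mutex
    // this second lock on the same thread would be undefined behavior
    // (in practice, usually a deadlock).
    void append_line(const std::string& text) {
        std::lock_guard<std::recursive_mutex> guard(m_);
        append(text);
        append("\n");
    }

    // Returns a copy, not a reference to the guarded string.
    std::string snapshot() const {
        std::lock_guard<std::recursive_mutex> guard(m_);
        return text_;
    }

private:
    mutable std::recursive_mutex m_;
    std::string text_;
};

void demo_shared_log() {
    SharedLog log;
    log.append_line("a");
    log.append("b");
    assert(log.snapshot() == "a\nb");
}
```

An alternative design keeps a plain std::mutex and routes the public methods through a private, unlocked helper. That avoids recursive locking altogether, at the cost of a little more discipline.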
"Safely" is defined exactly as the common sense dictates - it means "doing its thing correctly without interfering with other things". The six points you cite quite clearly express the requirements to achieve that.
The answers to your 3 questions are 3× "no".
Are all recursive functions reentrant?
NO!
Two simultaneous invocations of a recursive function can easily screw each other up if they access the same global/static data, for example.
Are all thread-safe functions reentrant?
NO!
A function is thread-safe if it doesn't malfunction when called concurrently. But this can be achieved, e.g., by using a mutex to block the execution of the second invocation until the first finishes, so only one invocation works at a time. Reentrancy means executing concurrently without interfering with other invocations.
Are all recursive and thread-safe functions reentrant?
NO!
See above.
The common thread:
Is the behavior well defined if the routine is called while it is interrupted?
If you have a function like this:
Then it is not dependent upon any external state. The behavior is well defined.
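A function of that kind might look like the sketch below. The name add is only an illustration.

```cpp
#include <cassert>

// Depends only on its arguments: no statics, no globals, nothing shared.
int add(int a, int b) {
    return a + b;
}

void demo_add() {
    assert(add(2, 3) == 5);
    assert(add(2, 3) == 5);  // same inputs, same result, whatever ran in between
}
```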
If you have a function like this:
The result is not well defined on multiple threads. Information could be lost if the timing was just wrong.
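By contrast, a function with that problem could be sketched like this. Both g_value and add_to_global are hypothetical names chosen for the example.

```cpp
#include <cassert>

int g_value = 0;  // hypothetical shared global

// Read-modify-write on shared state: the result depends on every earlier
// call, and two threads doing this at once is a data race.
int add_to_global(int a) {
    g_value += a;
    return g_value;
}

void demo_add_to_global() {
    g_value = 0;
    assert(add_to_global(5) == 5);
    assert(add_to_global(5) == 10);  // same argument, different result
}
```

Even on a single thread the same argument yields different results. With several threads, an update can be lost when two read-modify-write sequences interleave.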
The simplest form of a reentrant function is something that operates exclusively on the arguments passed and constant values. Anything else takes special handling or, often, is not reentrant. And of course the arguments must not reference mutable globals.
Now I have to elaborate on my previous comment. @paercebal's answer is incorrect. In the example code, didn't anyone notice that the mutex, which was supposed to be a parameter, wasn't actually passed in?
I dispute the conclusion. I assert: for a function to be safe in the presence of concurrency, it must be re-entrant. Therefore concurrent-safe (usually written thread-safe) implies re-entrant.
Neither thread-safe nor re-entrant has anything to say about arguments: we're talking about concurrent execution of the function, which can still be unsafe if inappropriate parameters are used.
For example, memcpy() is thread-safe and re-entrant (usually). Obviously it will not work as expected if called with pointers to the same targets from two different threads. That's the point of the SGI definition, placing the onus on the client to ensure accesses to the same data structure are synchronised by the client.
It is important to understand that in general it is nonsense to have thread-safe operation include the parameters. If you've done any database programming you will understand. The concept of what is "atomic" and might be protected by a mutex or some other technique is necessarily a user concept: processing a transaction on a database can require multiple un-interrupted modifications. Who can say which ones need to be kept in sync but the client programmer?
The point is that "corruption" doesn't have to be messing up the memory on your computer with unserialised writes: corruption can still occur even if all individual operations are serialised. It follows that when you're asking if a function is thread-safe, or re-entrant, the question means for all appropriately separated arguments: using coupled arguments does not constitute a counter-example.
There are many programming systems out there (OCaml is one, and I think Python as well) which have lots of non-reentrant code in them, but which use a global lock to interleave thread access. These systems are not re-entrant and they're not thread-safe or concurrent-safe; they operate safely simply because they prevent concurrency globally.
A good example is malloc. It is not re-entrant and not thread-safe. This is because it has to access a global resource (the heap). Using locks doesn't make it safe: it's definitely not re-entrant. If the interface to malloc had been designed properly, it would be possible to make it re-entrant and thread-safe:
Now it can be safe because it transfers the responsibility for serialising shared access to a single heap to the client. In particular no work is required if there are separate heap objects. If a common heap is used, the client has to serialise access. Using a lock inside the function is not enough: just consider a malloc locking a heap* and then a signal comes along and calls malloc on the same pointer: deadlock: the signal can't proceed, and the client can't either because it is interrupted.
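A minimal sketch of what such an interface could look like, under assumptions: the Heap struct and heap_alloc are hypothetical, not a real library API, and the allocator is a simple bump allocator that never frees. The point is only that all state lives in the heap object passed by the caller.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical heap object owned by the client; not part of any real library.
struct Heap {
    unsigned char* base;  // start of the client's buffer, assumed max-aligned
    std::size_t size;     // total bytes in the buffer
    std::size_t used;     // bytes handed out so far
};

// Bump allocation from the heap passed in. No globals and no locks: calls on
// different Heap objects cannot interfere with each other.
void* heap_alloc(Heap* heap, std::size_t bytes) {
    assert(heap != nullptr);
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (heap->used + align - 1) / align * align;
    if (start > heap->size || bytes > heap->size - start) {
        return nullptr;  // out of space
    }
    heap->used = start + bytes;
    return heap->base + start;
}
```

Two threads that each own a Heap need no synchronization at all. Two threads sharing one Heap must serialize their calls themselves, which is the division of responsibility described above.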
Generally speaking, locks do not make things thread-safe; they actually destroy safety by inappropriately trying to manage a resource that is owned by the client. Locking has to be done by the object manufacturer: that's the only code that knows how many objects are created and how they will be used.
The "common thread" (pun intended!?) amongst the points listed is that the function must not do anything that would affect the behaviour of any recursive or concurrent calls to the same function.
So for example static data is an issue because it is owned by all threads; if one call modifies a static variable, then all threads use the modified data, thus affecting their behaviour. Self-modifying code (although rarely encountered, and in some cases prevented) would be a problem, because although there are multiple threads, there is only one copy of the code; the code is essentially static data too.
Essentially to be re-entrant, each thread must be able to use the function as if it were the only user, and that is not the case if one thread can affect the behaviour of another in a non-deterministic manner. Primarily this involves each thread having either separate or constant data that the function works on.
All that said, point (1) is not necessarily true; for example, you might legitimately and by design use a static variable to retain a recursion count to guard against excessive recursion or to profile an algorithm.
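As a sketch of that deliberate use of a static, assuming a made-up limit kMaxDepth and a toy function sum_to: the static counter tracks how deep the recursion currently is, and a small RAII guard restores it on every exit path, including exceptions.

```cpp
#include <cassert>
#include <stdexcept>

constexpr int kMaxDepth = 100;  // hypothetical limit, chosen for illustration

// RAII helper: bumps the counter on entry and restores it on every exit path.
struct DepthGuard {
    int& depth;
    explicit DepthGuard(int& d) : depth(d) { ++depth; }
    ~DepthGuard() {
        assert(depth > 0);
        --depth;
    }
};

// Sums 1..n recursively. The static counter is shared by all calls, which is
// exactly what point (1) forbids, yet here it is deliberate.
long sum_to(int n) {
    static int depth = 0;  // not safe if several threads call sum_to
    if (depth >= kMaxDepth) {
        throw std::runtime_error("recursion too deep");
    }
    DepthGuard guard(depth);
    return n <= 0 ? 0 : n + sum_to(n - 1);
}
```

The counter makes the function safe for recursion on one thread but not for concurrent callers. Declaring it thread_local would be one way to give each thread its own count.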
A thread-safe function need not be reentrant; it may achieve thread safety by specifically preventing reentrancy with a lock, and point (6) says that such a function is not reentrant. Regarding point (6), a function that calls a thread-safe function that locks is not safe for use in recursion (it will deadlock), and is therefore not said to be reentrant, though it may nonetheless be safe for concurrency, and would still be re-entrant in the sense that multiple threads can have their program counters in such a function simultaneously (just not within the locked region). Maybe this helps to distinguish thread safety from reentrancy (or maybe it adds to your confusion!).
The answers to your "Also" questions are "No", "No" and "No". Just because a function is recursive and/or thread-safe, that doesn't make it re-entrant.
Each of these types of function can fail on all the points you quote. (Though I'm not 100% certain of point 5).
The terms "thread-safe" and "re-entrant" mean only and exactly what their definitions say. "Safe" in this context means only what the definition you quote says.
"Safe" here certainly doesn't mean safe in the broader sense that calling a given function in a given context won't totally hose your application. Altogether, a function might reliably produce a desired effect in your multi-threaded application but not qualify as either re-entrant or thread-safe according to the definitions. Conversely, you can call re-entrant functions in ways that will produce a variety of undesired, unexpected and/or unpredictable effects in your multi-threaded application.
A recursive function can be anything, and re-entrant has a stronger definition than thread-safe, so the answers to your numbered questions are all no.
Reading the definition of re-entrant, one might summarize it as meaning a function which will not modify anything beyond what you call it to modify. But you shouldn't rely on only the summary.
Multi-threaded programming is just extremely difficult in the general case. Knowing which parts of one's code are re-entrant is only a part of this challenge. Thread safety is not additive. Rather than trying to piece together re-entrant functions, it's better to use an overall thread-safe design pattern and use this pattern to guide your use of every thread and shared resource in your program.