Finalizer launched while its object was still being used
Summary: C#/.NET is supposed to be garbage collected. C# has a destructor, used to clean up resources. What happens when an object A is garbage collected on the same line where I try to clone one of its member variables? Apparently, on multiprocessors, sometimes, the garbage collector wins...
The problem
Today, in a training session on C#, the teacher showed us some code which contained a bug only when run on multiprocessors.
I'll summarize by saying that sometimes, the compiler or the JIT screws up by calling the finalizer of a C# class object before returning from its called method.
The full code, given in the Visual C++ 2005 documentation, will be posted as an "answer" to avoid making this a very large question, but the essentials are below:
The following class has a "Hash" property which returns a cloned copy of an internal array. At its construction, the first item of the array has a value of 2. In the destructor, its value is set to zero.
The point is: if you try to get the "Hash" property of "Example", you'll get a clean copy of the array, whose first item is still 2, as the object is being used (and as such, not being garbage collected/finalized):
public class Example
{
    private int nValue;
    public int N { get { return nValue; } }

    // The Hash property is slower because it clones an array. When
    // KeepAlive is not used, the finalizer sometimes runs before
    // the Hash property value is read.
    private byte[] hashValue;
    public byte[] Hash { get { return (byte[])hashValue.Clone(); } }

    public Example()
    {
        nValue = 2;
        hashValue = new byte[20];
        hashValue[0] = 2;
    }

    ~Example()
    {
        nValue = 0;
        if (hashValue != null)
        {
            Array.Clear(hashValue, 0, hashValue.Length);
        }
    }
}
But nothing is so simple...
The code using this class is working inside a thread, and of course, for the test, the app is heavily multithreaded:
public static void Main(string[] args)
{
    Thread t = new Thread(new ThreadStart(ThreadProc));
    t.Start();
    t.Join();
}

private static void ThreadProc()
{
    // running is a boolean which is always true until
    // the user presses ENTER
    while (running) DoWork();
}
The DoWork static method is the code where the problem happens:
private static void DoWork()
{
    Example ex = new Example();
    byte[] res = ex.Hash; // [1]

    // If the finalizer runs before the call to the Hash
    // property completes, the hashValue array might be
    // cleared before the property value is read. The
    // following test detects that.
    if (res[0] != 2)
    {
        // Oops... The finalizer of ex was launched before
        // the Hash method/property completed
    }
}
Apparently, once every 1,000,000 executions of DoWork, the garbage collector does its magic and tries to reclaim "ex", as it is no longer referenced in the remaining code of the function, and this time, it is faster than the "Hash" get method. So what we have in the end is a clone of a zeroed byte array, instead of the right one (with the first item at 2).
My guess is that there is inlining of the code, which essentially replaces the line marked [1] in the DoWork function by something like:
// Supposed inlined processing
byte[] res2 = ex.Hash2;
// note that after this line, "ex" could be garbage collected,
// but not res2
byte[] res = (byte[])res2.Clone();
Here, suppose Hash2 is a simple accessor coded like:
// Hash2 code:
public byte[] Hash2 { get { return (byte[])hashValue; } }
So, the question is: is this supposed to work that way in C#/.NET, or could this be considered a bug in either the compiler or the JIT?
edit
See Chris Brumme's and Chris Lyons' blogs for an explanation.
http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx
http://blogs.msdn.com/clyon/archive/2004/09/21/232445.aspx
Everyone's answer was interesting, but I couldn't choose one better than the other. So I gave you all a +1...
Sorry
:-)
Edit 2
I was unable to reproduce the problem on Linux/Ubuntu/Mono, despite using the same code under the same conditions (multiple instances of the same executable running simultaneously, release mode, etc.)
The Full Code
You'll find below the full code, copy/pasted from a Visual C++ 2008 .cs file. As I'm now on Linux, without any Mono compiler or knowledge of how to use one, there's no way I can run tests now. Still, a couple of hours ago, I saw this code run and reproduce the bug:
For those interested, I can send the zipped project through email.
It's perfectly normal for the finalizer to be called in your DoWork method: after the ex.Hash call, the CLR knows that the ex instance won't be needed anymore...
Now, if you want to keep the instance alive, call GC.KeepAlive(ex) at the end of the method.
GC.KeepAlive does... nothing :) It's an empty method that the JIT will not inline, whose only purpose is to trick the GC into thinking the object will be used after this.
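As a minimal sketch of that suggestion, the code below reuses a trimmed-down version of the Example class from the question and adds a GC.KeepAlive call after the property read. The names KeepAliveSketch and ReadFirstHashByte are hypothetical and exist only for illustration.

```csharp
using System;

public class Example
{
    private byte[] hashValue;

    public Example()
    {
        hashValue = new byte[20];
        hashValue[0] = 2;
    }

    // Returns a copy of the internal array.
    public byte[] Hash { get { return (byte[])hashValue.Clone(); } }

    // The finalizer clears the internal array, as in the question.
    ~Example()
    {
        if (hashValue != null)
        {
            Array.Clear(hashValue, 0, hashValue.Length);
        }
    }
}

// Hypothetical helper class, not part of the original code.
public static class KeepAliveSketch
{
    public static byte ReadFirstHashByte()
    {
        Example ex = new Example();
        byte[] res = ex.Hash;

        // GC.KeepAlive counts as a use of ex, so the object stays
        // reachable (and its finalizer cannot run) until this call,
        // which comes after the Hash getter has returned.
        GC.KeepAlive(ex);

        return res[0];
    }
}
```

With the call in place, ReadFirstHashByte should return 2 on every call, because ex is still considered live while the array is being cloned.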
WARNING: Your example would be perfectly valid if the DoWork method were a managed C++ method... You DO have to keep the managed instances alive manually if you don't want the destructor to be called from within another thread. For example, you pass a reference to a managed object that will delete a blob of unmanaged memory when finalized, and the method is using this same blob. If you don't hold the instance alive, you're going to have a race condition between the GC and your method's thread.
And this will end up in tears. And managed heap corruption...
Interesting comment from Chris Brumme's blog
http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx
Yes, this is an issue that has come up before.
It's even more fun in that you need to run a release build for this to happen, and you end up scratching your head going 'huh, how can that be null?'.
I think what you are seeing is reasonable behavior due to the fact that things are running on multiple threads. This is the reason for the GC.KeepAlive() method, which should be used in this case to tell the GC that the object is still being used and that it isn't a candidate for cleanup.
Looking at the DoWork function in your "full code" response, the problem is that immediately after the line that reads ex.Hash, the function no longer makes any references to the ex object, so it becomes eligible for garbage collection at that point. Adding the call to GC.KeepAlive would prevent this from happening.
this looks like a race condition between your work thread and the GC thread(s); to avoid it, i think there are two options:
(1) change your if statement to use ex.Hash[0] instead of res, so that ex cannot be GC'd prematurely, or
(2) lock ex for the duration of the call to Hash
that's a pretty spiffy example - was the teacher's point that there may be a bug in the JIT compiler that only manifests on multicore systems, or that this kind of coding can have subtle race conditions with garbage collection?
What you're seeing is perfectly natural.
You don't keep a reference to the object that owns the byte array, so that object (not the byte array) is actually free for the garbage collector to collect.
The garbage collector really can be that aggressive.
So if you call a method on your object, which returns a reference to an internal data structure, and the finalizer for your object messes up that data structure, you need to keep a live reference to the object as well.
The garbage collector sees that the ex variable isn't used in that method any more, so it can, and as you notice, will garbage collect it under the right circumstances (i.e. timing and need).
The correct way to do this is to call GC.KeepAlive on ex, so add GC.KeepAlive(ex); as the last line of your method, and all should be well.
I learned about this aggressive behavior by reading the book Applied .NET Framework Programming by Jeffrey Richter.
It's simply a bug in your code: finalizers should not be accessing managed objects.
The only reason to implement a finalizer is to release unmanaged resources. And in this case, you should carefully implement the standard IDisposable pattern.
With this pattern, you implement a protected method "protected virtual void Dispose(bool disposing)". When this method is called from the finalizer, it cleans up unmanaged resources, but does not attempt to clean up managed resources.
In your example, you don't have any unmanaged resources, so you should not be implementing a finalizer.
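For illustration, here is a rough sketch of the pattern described above. It assumes a class that does own an unmanaged resource; the NativeBuffer class, its IsDisposed property, and the use of Marshal.AllocHGlobal as the resource are all hypothetical choices, not part of the question's code.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class that owns a block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr buffer;   // unmanaged resource
    private bool disposed;

    public NativeBuffer(int size)
    {
        buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        // Cleanup is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;

        if (disposing)
        {
            // Called from Dispose(): managed members could be
            // released here. This sketch has none.
        }

        // Runs on both paths: free only the unmanaged memory and
        // do not touch other managed objects.
        if (buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(buffer);
            buffer = IntPtr.Zero;
        }
        disposed = true;
    }

    ~NativeBuffer()
    {
        // Finalizer path: unmanaged cleanup only.
        Dispose(false);
    }
}
```

Callers would normally wrap such an object in a using statement, so that Dispose runs deterministically and the finalizer acts only as a safety net.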