.NET C#: unsafe/fixed doesn't pin a pass-through array element?

Posted 2024-10-30 19:36:55

I have some concurrent code which has an intermittent failure and I've reduced the problem down to two cases which seem identical, but where one fails and the other doesn't.

I've now spent way too much time trying to create a minimal, complete example that fails, but without success, so I'm just posting the lines that fail in case anyone can see an obvious problem.

object m_lock = new object();               // renamed: `lock` is a reserved C# keyword

struct MyValueType { readonly public int i1, i2; };
class Node { public MyValueType x; public int y; public Node z; };
volatile Node[] m_rg = new Node[300];

unsafe void Foo()
{
    // Take the lock, retrying until the array reference is unchanged under it.
    Node[] temp;
    while (true)
    {
        temp = m_rg;
        /* ... */
        Monitor.Enter(m_lock);
        if (temp == m_rg)
            break;
        Monitor.Exit(m_lock);
    }

    // `e` is declared in code not shown here.
#if OK                                      // this works:
    Node cur = temp[33];
    fixed (MyValueType* pe = &cur.x)
        *(long*)pe = *(long*)&e;
#else                                       // this reliably causes random corruption:
    fixed (MyValueType* pe = &temp[33].x)
        *(long*)pe = *(long*)&e;
#endif

    Monitor.Exit(m_lock);
}

I have studied the IL code and it looks like what's happening is that the Node object at array position 33 is moving (in very rare cases) despite the fact that we are holding a pointer to a value type within it.

It's as if the CLR doesn't notice that we are passing through a heap (movable) object--the array element--in order to access the value type. The 'OK' version has never failed under extended testing on an 8-way machine, but the alternate path fails quickly every time.

  • Is this never supposed to work, and is the 'OK' version just too streamlined to fail under stress?
  • Do I need to pin the object myself using GCHandle (I notice in the IL that the fixed statement alone is not doing so)?
  • If manual pinning is required here, why is the compiler allowing access through a heap object (without pinning) in this way?

Note: This question is not about the elegance of reinterpreting the blittable value type in a nasty way, so please, no criticism of that aspect of the code unless it is directly relevant to the problem at hand. Thanks.

[edit: jitted asm]
Thanks to Hans' reply, I understand better why the jitter is placing things on the stack in what otherwise seem like vacuous asm operations. See [rsp + 50h] for example, and how it gets nulled out after the 'fixed' region. The remaining unresolved question is whether [cur+18h] (lines 207-20C) on the stack is somehow sufficient to protect the access to the value type in a way that is not adequate for [temp+33*IntPtr.Size+18h] (line 24A).

[Image: jitted x64 assembly listing referred to above; not reproduced here]

[edit]

summary of conclusions, minimal example

Comparing the two code fragments below, I now believe that #1 is not ok, whereas #2 is acceptable.

(1.) The following fails (on x64 jit at least); GC can still move the MyClass instance if you try to fix it in-situ, via an array reference. There's no place on the stack for the reference of the particular object instance (the array element that needs to be fixed) to be published, for the GC to notice.

struct MyValueType { public int foo; };
class MyClass { public MyValueType mvt; };
MyClass[] rgo = new MyClass[2000];

fixed (MyValueType* pvt = &rgo[1234].mvt)
    *(int*)pvt = 1234;

(2.) But you can access a structure inside a (movable) object using fixed (without pinning) if you provide an explicit reference on the stack which can be advertised to the GC:

struct MyValueType { public int foo; };
class MyClass { public MyValueType mvt; };
MyClass[] rgo = new MyClass[2000];

MyClass mc = rgo[1234];               // <-- only difference -- add this line
fixed (MyValueType* pvt = &mc.mvt)    // <-- and adjust accordingly here
    *(int*)pvt = 1234;
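
For anyone who wants to try fragment (2.) as-is, here is a minimal self-contained sketch of it. The wrapper class `FixedViaLocalSketch` and the helpers `MakePopulated` and `WriteThroughLocal` are hypothetical names added for illustration, and the array elements are instantiated first because `new MyClass[2000]` alone holds only null references. It needs to be compiled with unsafe code enabled (`AllowUnsafeBlocks`). It only shows that the form compiles and writes the expected value; whether it is actually safe against GC relocation is exactly what this thread leaves open.

```csharp
using System;

public static class FixedViaLocalSketch
{
    // Same shapes as in the fragments above, nested to keep the sketch self-contained.
    public struct MyValueType { public int foo; }
    public class MyClass { public MyValueType mvt; }

    // Hypothetical helper: builds an array whose elements are real instances
    // (a freshly allocated MyClass[] contains only nulls).
    public static MyClass[] MakePopulated(int length)
    {
        var rgo = new MyClass[length];
        for (int i = 0; i < rgo.Length; i++)
            rgo[i] = new MyClass();
        return rgo;
    }

    // Fragment (2.): copy the element reference into a local first,
    // then apply `fixed` to the struct field reached through that local.
    public static unsafe int WriteThroughLocal(MyClass[] rgo, int index, int value)
    {
        MyClass mc = rgo[index];            // explicit reference held in a local
        fixed (MyValueType* pvt = &mc.mvt)
        {
            *(int*)pvt = value;             // reinterpret the struct as its single int field
        }
        return rgo[index].mvt.foo;          // read back through the array
    }
}
```

A call such as `FixedViaLocalSketch.WriteThroughLocal(FixedViaLocalSketch.MakePopulated(2000), 1234, 1234)` mirrors the index and value used in the fragments.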

This is where I'll leave it unless someone can provide corrections or more information...

Comments (2)

王权女流氓 2024-11-06 19:36:55

Modifying objects of managed type through fixed pointers can result in undefined behavior
(C# Language specification, chapter 18.6.)

Well, you are doing just that. In spite of the verbiage in the spec and the MSDN library, the fixed keyword does not in fact make the object unmovable; it doesn't get pinned. You probably found that out from looking at the IL. It uses a clever trick: it generates a pointer + offset and lets the garbage collector adjust the pointer. I don't have a great explanation for why this fails in one case but not the other. I don't see a fundamental difference in the generated machine code, but then I probably didn't reproduce your exact machine code either; the snippet isn't great.

As near as I can tell it should fail in both cases because of the structure member access. That causes the pointer + offset to collapse to a single pointer with a LEA instruction, preventing the garbage collector from recognizing the reference. Structures have always been trouble for the jitter. Thread timing could explain the difference, perhaps.

You could post to connect.microsoft.com for a second opinion. It is, however, going to be difficult to navigate around the spec violation. If my theory is correct, then a read could fail too, though that is much harder to prove.

Fix it by actually pinning the array with GCHandle.
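
As a rough illustration of what explicit pinning with GCHandle looks like, here is a minimal sketch. It rests on an assumption that does not hold for the question's code as posted: the data lives in an array of blittable structs (`MyValueType[]`). `GCHandleType.Pinned` rejects arrays whose elements are object references, such as `Node[]`, with an ArgumentException, so the layout would have to change before this applies. The names `PinnedArraySketch` and `WritePinned` are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedArraySketch
{
    // Assumed layout: a blittable struct stored directly in the array.
    public struct MyValueType { public int foo; }

    // Pins the array for the duration of the write, so the GC cannot
    // relocate it while a raw address into it is in use.
    public static int WritePinned(MyValueType[] items, int index, int value)
    {
        GCHandle handle = GCHandle.Alloc(items, GCHandleType.Pinned);
        try
        {
            IntPtr basePtr = handle.AddrOfPinnedObject();            // address of element 0
            int offset = index * Marshal.SizeOf<MyValueType>();      // byte offset of the element
            Marshal.WriteInt32(basePtr, offset, value);              // `foo` is the struct's only field
        }
        finally
        {
            handle.Free();                                           // always release the pin
        }
        return items[index].foo;                                     // read back through the array
    }
}
```

Freeing the handle in a `finally` block matters because a leaked pinned handle keeps the array immovable for the rest of the process lifetime. The sketch uses `Marshal.WriteInt32` rather than a raw pointer so that it compiles without unsafe code enabled.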

娇柔作态 2024-11-06 19:36:55

Puzzling over this, and I'm guessing here, it looks like the compiler is taking &temp (a fixed pointer to the temp array) and then indexing that with [33]. So you're pinning the temp array rather than the node. Try...

fixed (MyValueType* pe = &(temp[33]).x)
    *(long*)pe = *(long*)&e;