Testing initialization safety of final fields

Posted 2024-10-18 04:37:05

I am trying to simply test out the initialization safety of final fields as guaranteed by the JLS. It is for a paper I'm writing. However, I am unable to get it to 'fail' based on my current code. Can someone tell me what I'm doing wrong, or if this is just something I have to run over and over again and then see a failure with some unlucky timing?

Here is my code:

public class TestClass {

    final int x;
    int y;
    static TestClass f;

    public TestClass() {
        x = 3;
        y = 4;
    }

    static void writer() {
        TestClass.f = new TestClass();
    }

    static void reader() {
        if (TestClass.f != null) {
            int i = TestClass.f.x; // guaranteed to see 3
            int j = TestClass.f.y; // could see 0

            System.out.println("i = " + i);
            System.out.println("j = " + j);
        }
    }
}

and my threads are calling it like this:

public class TestClient {

    public static void main(String[] args) {

        for (int i = 0; i < 10000; i++) {
            Thread writer = new Thread(new Runnable() {
                @Override
                public void run() {
                    TestClass.writer();
                }
            });

            writer.start();
        }

        for (int i = 0; i < 10000; i++) {
            Thread reader = new Thread(new Runnable() {
                @Override
                public void run() {
                    TestClass.reader();
                }
            });

            reader.start();
        }
    }
}

I have run this scenario many, many times. My current loops spawn 10,000 threads, but I have also tried 1,000, 100,000, and even a million. Still no failure. I always see 3 and 4 for the two values. How can I get this to fail?

樱＆纷飞 2024-10-25 04:37:05

I wrote the spec. The TL;DR version of this answer is that just because it may see 0 for y, that doesn't mean it is guaranteed to see 0 for y.

In this case, the final field spec guarantees that you will see 3 for x, as you point out. Think of the writer thread as having 4 instructions:

r1 = <create a new TestClass instance>
r1.x = 3;
r1.y = 4;
f = r1;

The reason you might not see 3 for x would be the compiler reordering this code as follows:

r1 = <create a new TestClass instance>
f = r1;
r1.x = 3;
r1.y = 4;

The way the guarantee for final fields is usually implemented in practice is to ensure that the constructor finishes before any subsequent program actions take place. Imagine someone erected a big barrier between r1.y = 4 and f = r1. So, in practice, if you have any final fields for an object, you are likely to get visibility for all of them.

Now, in theory, someone could write a compiler that isn't implemented that way. In fact, many people have often talked about testing code by writing the most malicious compiler possible. This is particularly common among the C++ people, who have lots and lots of undefined corners of their language that can lead to terrible bugs.
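The "big barrier" described above can be pictured in code. The following is only a minimal sketch, not what HotSpot literally emits: it assumes Java 9 or later for VarHandle.storeStoreFence(), and the class FenceSketch and its methods are names invented for this illustration. It writes both fields, issues a store-store fence, and only then publishes the reference, mirroring the four-instruction view of the writer.

```java
import java.lang.invoke.VarHandle;

// Hypothetical illustration: FenceSketch and its members are invented names.
final class FenceSketch {
    int x;                   // plain fields; the explicit fence plays the role of the "barrier"
    int y;
    static FenceSketch f;

    static void writer() {
        FenceSketch r1 = new FenceSketch(); // r1 = <create a new instance>
        r1.x = 3;                           // r1.x = 3
        r1.y = 4;                           // r1.y = 4
        VarHandle.storeStoreFence();        // stores above are not reordered with stores below
        f = r1;                             // f = r1 (publication)
    }

    static int[] reader() {
        FenceSketch local = f;              // read the shared reference once
        return local == null ? null : new int[] { local.x, local.y };
    }

    public static void main(String[] args) {
        writer();
        int[] seen = reader();
        System.out.println("x = " + seen[0] + ", y = " + seen[1]);
    }
}
```

Run single-threaded like this, the sketch trivially sees both values; the point is only where the fence sits relative to the publishing store.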

莫言歌 2024-10-25 04:37:05

From Java 5.0, you are guaranteed that all threads will see the final state set by the constructor.

If you want to see this fail, you could try an older JVM like 1.3.

I wouldn't print out every test, I would only print out the failures. You could get one failure in a million but miss it. But if you only print failures, they should be easy to spot.

A simpler way to see this fail is to add the following to the writer:

f.y = 5;

and test for

int y = TestClass.f.y; // could see 0, 4 or 5
if (y != 5)
    System.out.println("y = " + y);
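The advice to report only failures could be sketched roughly as follows. This is one possible harness, not the asker's code: the class QuietRaceTest, the nested Box class, and the two counters are names made up for the example. Each round races one writer against one reader and only counts anomalies instead of printing every read.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness; QuietRaceTest, Box, badX and badY are invented names.
final class QuietRaceTest {
    static final class Box {
        final int x;
        int y;
        Box() { x = 3; y = 4; }
    }

    static Box f;                                    // racy, unsynchronized publication on purpose
    static final AtomicInteger badX = new AtomicInteger();
    static final AtomicInteger badY = new AtomicInteger();

    static void runOnce() {
        f = null;                                    // reset before the threads are started
        Thread writer = new Thread(() -> f = new Box());
        Thread reader = new Thread(() -> {
            Box local = f;                           // read the reference once
            if (local != null) {
                if (local.x != 3) badX.incrementAndGet(); // would violate the final-field guarantee
                if (local.y != 4) badY.incrementAndGet(); // permitted by the JLS, rarely observed
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            runOnce();
        }
        // Only the summary is printed, so a one-in-a-million anomaly is not lost in the noise.
        System.out.println("x anomalies: " + badX.get() + ", y anomalies: " + badY.get());
    }
}
```

A zero count for y proves nothing about the code being correct; it only means the race was not observed in this run on this JVM and hardware.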

何以畏孤独 2024-10-25 04:37:05

I'd like to see a test which fails or an explanation why it's not possible with current JVMs.

Multithreading and Testing

You can't prove that a multithreaded application is broken (or not) by testing for several reasons:

the problem might only appear once every x hours of running, x being so high that it is unlikely that you see it in a short test
the problem might only appear with some combinations of JVM / processor architectures

In your case, making the test break (i.e. observing y == 0) would require the program to see a partially constructed object in which some fields have been properly initialized and some have not. This typically does not happen on x86 / HotSpot.

How to determine whether multithreaded code is broken?

The only way to prove that the code is valid or broken is to apply the JLS rules to it and see what the outcome is. With data race publishing (no synchronization around the publication of the object or of y), the JLS provides no guarantee that y will be seen as 4 (it could be seen with its default value of 0).
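For contrast, here is a minimal sketch of what applying the JLS rules looks like when the data race is removed. It assumes the shared reference is declared volatile, which is a change to the original code, and the names SafePublication and Box are hypothetical. The volatile write by the writer happens-before the volatile read that observes it, so the reader is guaranteed to see y as 4.

```java
// Hypothetical example; SafePublication and Box are invented names.
final class SafePublication {
    static final class Box {
        final int x;
        int y;
        Box() { x = 3; y = 4; }
    }

    // Assumption for this sketch: the field is volatile, unlike TestClass.f in the question.
    static volatile Box f;

    static int readYAfterPublication() {
        f = null;
        Thread writer = new Thread(() -> f = new Box());
        writer.start();
        Box local;
        while ((local = f) == null) {
            Thread.yield();                 // spin until the writer has published
        }
        return local.y;                     // the volatile read of f makes the write of y visible
    }

    public static void main(String[] args) {
        System.out.println("y = " + readYAfterPublication());
    }
}
```

With the volatile modifier removed, the same program is back to the racy publication discussed above, and the JLS no longer promises a value of 4 for y.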

Can that code really break?

In practice, some JVMs will be better at making the test fail. For example some compilers (cf "A test case showing that it doesn't work" in this article) could transform TestClass.f = new TestClass(); into something like (because it is published via a data race):

(1) allocate memory
(2) write fields default values (x = 0; y = 0) //always first
(3) write final fields final values (x = 3)    //must happen before publication
(4) publish object                             //TestClass.f = new TestClass();
(5) write non final fields (y = 4)             //has been reordered after (4)

The JLS mandates that (2) and (3) happen before the object publication (4). However, due to the data race, no guarantee is given for (5) - it would actually be a legal execution if a thread never observed that write operation. With the proper thread interleaving, it is therefore conceivable that if reader runs between 4 and 5, you will get the desired output.
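The interleaving in the five steps above can be walked through by hand in a single thread. This is only a simulation under stated assumptions: the reordering is written out explicitly rather than produced by a compiler, plain fields stand in for the final and non-final pair, and ReorderSimulation, Obj and readerSnapshot are names invented for the sketch.

```java
// Hypothetical single-threaded walk-through; all names are invented for this sketch.
final class ReorderSimulation {
    static final class Obj {
        int x;   // stands in for the final field
        int y;   // stands in for the non-final field
    }

    static Obj f;

    // What a reader would see at the instant it runs.
    static int[] readerSnapshot() {
        Obj local = f;
        return local == null ? null : new int[] { local.x, local.y };
    }

    static int[] simulate() {
        f = null;
        Obj r1 = new Obj();              // (1) allocate, (2) default values x = 0, y = 0
        r1.x = 3;                        // (3) final field value written before publication
        f = r1;                          // (4) publish the object
        int[] seen = readerSnapshot();   // the reader slips in between (4) and (5)
        r1.y = 4;                        // (5) non-final write, placed after publication
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = simulate();
        System.out.println("i = " + seen[0] + ", j = " + seen[1]); // prints "i = 3, j = 0"
    }
}
```

This shows the outcome the question is hunting for, but only because the steps were reordered by hand; it says nothing about how often a real JVM would produce it.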

I don't have a Symantec JIT at hand, so I can't prove it experimentally :-)

万劫不复 2024-10-25 04:37:05

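Here is an example of default values of non-final fields being observed even though the constructor sets them and does not leak this. It is based on my other question, which is a bit more complicated. I keep seeing people say this can't happen on x86, but my example happens on x64 Linux with OpenJDK 6...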

累赘 2024-10-25 04:37:05

This is a good question with a complicated answer. I've split it into pieces for an easier read.

People have said here enough times that, under the strict rules of the JLS, you should be able to see the desired behavior. But the compilers (I mean C1 and C2), while they have to respect the JLS, are free to make optimizations. I will get to this later.

Let's take the first, easy scenario, where there are two non-final variables, and see if we can publish an object incorrectly. For this test, I am using a specialized tool that was tailored for exactly this kind of test. Here is a test using it:

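import org.openjdk.jcstress.annotations.Actor;
import org.openjdk.jcstress.annotations.Expect;
import org.openjdk.jcstress.annotations.JCStressTest;
import org.openjdk.jcstress.annotations.Outcome;
import org.openjdk.jcstress.annotations.State;
import org.openjdk.jcstress.infra.results.II_Result;

@Outcome(id = "0, 2", expect = Expect.ACCEPTABLE_INTERESTING, desc = "not correctly published")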
@Outcome(id = "1, 0", expect = Expect.ACCEPTABLE_INTERESTING, desc = "not correctly published")
@Outcome(id = "1, 2", expect = Expect.ACCEPTABLE, desc = "published OK")
@Outcome(id = "0, 0", expect = Expect.ACCEPTABLE, desc = "II_Result default values for int, not interesting")
@Outcome(id = "-1, -1", expect = Expect.ACCEPTABLE, desc = "actor2 acted before actor1, this is OK")
@State
@JCStressTest
public class FinalTest {

    int x = 1;
    Holder h;

    @Actor
    public void actor1() {
        h = new Holder(x, x + 1);
    }

    @Actor
    public void actor2(II_Result result) {
        Holder local = h;
        // the other actor did its job
        if (local != null) {
            // if correctly published, we can only see {1, 2} 
            result.r1 = local.left;
            result.r2 = local.right;
        } else {
            // this is the case to "ignore" default values that are
            // stored in II_Result object
            result.r1 = -1;
            result.r2 = -1;
        }
    }

    public static class Holder {

        // non-final
        int left, right;

        public Holder(int left, int right) {
            this.left = left;
            this.right = right;
        }
    }
}

You do not have to understand the code in detail; the minimal explanation is this: there are two Actors that mutate some shared data, and the results are recorded. The @Outcome annotations classify those recorded results and set certain expectations (under the hood things are far more interesting and verbose). Just bear in mind that this is a very sharp and specialized tool; you can't really do the same thing with two plain threads running.

Now, if I run this, the results in these two:

 @Outcome(id = "0, 2", expect = Expect.ACCEPTABLE_INTERESTING....)
 @Outcome(id = "1, 0", expect = Expect.ACCEPTABLE_INTERESTING....)

will be observed (meaning there was an unsafe publication of the object, which the other Actor/Thread has actually seen).

Specifically these are observed in the so-called TC2 suite of tests, and these are actually run like this:

java... -XX:-TieredCompilation 
        -XX:+UnlockDiagnosticVMOptions 
        -XX:+StressLCM 
        -XX:+StressGCM

I will not dive too deep into what these do, but here is what StressLCM and StressGCM do and, of course, what the TieredCompilation flag does.

The entire point of the test is that:

This code proves that two non-final variables set in the constructor can be published incorrectly, and it was run on x86.

The sane thing to do now, since there is a specialized tool in place, is to change a single field to final and see it break. So we change this and run again, expecting to observe the failure:

public static class Holder {

    // this is the change
    final int right;
    int left;

    public Holder(int left, int right) {
        this.left = left;
        this.right = right;
    }
}

But if we run it again, the failure is not there, i.e. neither of the two @Outcome cases discussed above is part of the output. How come?

It turns out that when you write to even a single final variable, the JVM (specifically C1) will do the correct thing, all the time. So even for a single field this is impossible to demonstrate, at least at the moment.

In theory you could throw Shenandoah into this, along with its interesting flag ShenandoahOptimizeInstanceFinals (not going to dive into it). I have tried running the previous example with:

 -XX:+UnlockExperimentalVMOptions  
 -XX:+UseShenandoahGC  
 -XX:+ShenandoahOptimizeInstanceFinals  
 -XX:-TieredCompilation  
 -XX:+UnlockDiagnosticVMOptions  
 -XX:+StressLCM  
 -XX:+StressGCM

but this does not work as I hoped it would. What is far worse for my argument for even trying this is that these flags are going to be removed in jdk-14.

Bottom-line: At the moment there is no way to break this.

晒暮凉 2024-10-25 04:37:05

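What if you modified the constructor to do this:

public TestClass() {
    try {
        Thread.sleep(300); // sleep throws a checked exception, so it must be handled here
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
    }
    x = 3;
    y = 4;
}

I am not an expert on JLS finals and initializers, but common sense tells me this should delay setting x long enough for writers to register another value?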

永言不败 2024-10-25 04:37:05

What if one changes the scenario into the following?

public class TestClass {

    final int x;
    static TestClass f;

    public TestClass() {
        x = 3;
    }

    int y = 4;

    // etc...

}

浊酒尽余欢 2024-10-25 04:37:05

A better understanding of why this test does not fail can come from understanding what actually happens when a constructor is invoked. The JVM is a stack-based machine. TestClass.f = new TestClass(); consists of four actions. First the new instruction is called; it is like malloc in C/C++: it allocates memory and places a reference to it on the top of the stack. Then the reference is duplicated for invoking the constructor. The constructor is in fact like any other instance method; it is invoked with the duplicated reference. Only after that is the reference stored in the method frame or in the instance field, becoming accessible from anywhere else. Before the last step, the reference to the object is present only on the top of the creating thread's stack and nobody else can see it. In fact it makes no difference what kind of field you are working with; both will be initialized if TestClass.f != null. You can read the x and y fields from different objects, but this will not result in y = 0. For more information you should see the JVM Specification and the Stack-oriented programming language articles.

UPD: One important thing I forgot to mention. According to the Java memory model there is no way to see a partially initialized object, provided, of course, that you do not publish this from inside the constructor.

JLS:

An object is considered to be completely initialized when its
constructor finishes. A thread that can only see a reference to an
object after that object has been completely initialized is guaranteed
to see the correctly initialized values for that object's final
fields.

JLS:

There is a happens-before edge from the end of a constructor of an
object to the start of a finalizer for that object.

Broader explanation of this point of view:

It turns out that the end of an object's constructor happens-before
the execution of its finalize method. In practice, what this means is
that any writes that occur in the constructor must be finished and
visible to any reads of the same variable in the finalizer, just as if
those variables were volatile.

UPD: That was the theory, let's turn to practice.

Consider the following code, with simple non-final variables:

public class Test {

    int myVariable1;
    int myVariable2;

    Test() {
        myVariable1 = 32;
        myVariable2 = 64;
    }

    public static void main(String args[]) throws Exception {
        Test t = new Test();
        System.out.println(t.myVariable1 + t.myVariable2);
    }
}

The following command displays the machine instructions generated by the JVM; how to use it can be found in a wiki:

java.exe -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp
-XX:PrintAssemblyOptions=hsdis-print-bytes -XX:CompileCommand=print,*Test.main Test

Its output:

...
0x0263885d: movl   $0x20,0x8(%eax)    ;...c7400820 000000
                                    ;*putfield myVariable1
                                    ; - Test::<init>@7 (line 12)
                                    ; - Test::main@4 (line 17)
0x02638864: movl   $0x40,0xc(%eax)    ;...c7400c40 000000
                                    ;*putfield myVariable2
                                    ; - Test::<init>@13 (line 13)
                                    ; - Test::main@4 (line 17)
0x0263886b: nopl   0x0(%eax,%eax,1)   ;...0f1f4400 00
...

The field assignments are followed by a NOPL instruction; one of its purposes is to prevent instruction reordering.

Why does this happen? According to the specification, finalization happens after the constructor returns, so the GC thread can't see a partially initialized object. At the CPU level the GC thread is not distinguished from any other thread. If such guarantees are provided to the GC, then they are provided to any other thread. This is the most obvious solution to such a restriction.

Results:

1) The constructor is not synchronized; synchronization is done by other instructions.

2) Assignment to the object's reference can't happen before the constructor returns.

半步萧音过轻尘 2024-10-25 04:37:05

What's going on in this thread? Why should that code fail in the first place?

You launch thousands of threads that will each do the following:

TestClass.f = new TestClass();

What that does, in order:

evaluate TestClass.f to find out its memory location
evaluate new TestClass(): this creates a new instance of TestClass, whose constructor will initialize both x and y
assign the right-hand value to the left-hand memory location

An assignment is an atomic operation which is always performed after the right-hand value has been generated. Here is a citation from the Java language spec (see the first bulleted point) but it really applies to any sane language.

This means that while the TestClass() constructor is taking its time to do its job, and x and y could conceivably still be zero, the reference to the partially initialized TestClass object only lives in that thread's stack, or CPU registers, and has not been written to TestClass.f.

Therefore TestClass.f will always contain:

either null, at the start of your program, before anything else is assigned to it,
or a fully initialized TestClass instance.