为什么我的 System.nanoTime() 坏了?

发布于 2024-09-11 00:12:57 字数 2129 浏览 4 评论 0原文

我自己和当时的另一位开发人员最近从工作中的 Core 2 Duo 机器转移到了新的 Core 2 Quad 9505;两者都运行 Windows XP SP3 32 位和 JDK 1.6.0_18。

这样做后,我们对某些计时/统计/指标聚合代码的一些自动化单元测试立即开始失败,因为从 System.nanoTime() 返回的值似乎很荒谬。

在我的机器上可靠地显示此行为的测试代码是:

import static org.junit.Assert.assertThat;

import org.hamcrest.Matchers;
import org.junit.Test;

public class NanoTest {

  @Test
  public void testNanoTime() throws InterruptedException {
    final long sleepMillis = 5000;

    long nanosBefore = System.nanoTime();
    long millisBefore = System.currentTimeMillis();

    Thread.sleep(sleepMillis);

    long nanosTaken = System.nanoTime() - nanosBefore;
    long millisTaken = System.currentTimeMillis() - millisBefore;

    System.out.println("nanosTaken="+nanosTaken);
    System.out.println("millisTaken="+millisTaken);

    // Check it slept within 10% of requested time
    assertThat((double)millisTaken, Matchers.closeTo(sleepMillis, sleepMillis * 0.1));
    assertThat((double)nanosTaken, Matchers.closeTo(sleepMillis * 1000000, sleepMillis * 1000000 * 0.1));
  }

}

典型输出:

millisTaken=5001
nanosTaken=2243785148

运行 100 倍会产生实际睡眠时间 33% 到 60% 之间的纳米结果;但通常在 40% 左右。

我了解 Windows 中计时器准确性的弱点,并阅读了相关线程,例如 Is System. nanoTime() 跨线程一致?,但是我的理解是 System.nanoTime() 的目的正是我们使用它的目的:- 测量经过的时间;比 currentTimeMillis() 更准确。

有谁知道为什么它会返回如此疯狂的结果?这可能是硬件架构问题(唯一发生变化的是这台机器上的CPU/主板)?我当前硬件的 Windows HAL 有问题吗? JDK 问题?我应该放弃 nanoTime() 吗?我应该在某处记录错误,或者对如何进一步调查有任何建议吗?

更新 19/07 03:15 UTC:在尝试下面 finnw 的测试用例后,我做了一些更多的谷歌搜索,遇到了诸如 bugid:6440250。这也让我想起了周五晚些时候我注意到的其他一些奇怪的行为,其中 ping 值变为负数。所以我将 /usepmtimer 添加到我的 boot.ini 中,现在所有测试都按预期运行,并且我的 ping 也正常。

我有点困惑为什么这仍然是一个问题;根据我的阅读,我认为 TSC 与 PMT 问题在 Windows XP SP3 中已基本得到解决。难道是因为我的机器本来是SP2的,后来打补丁到了SP3,而不是原来安装的SP3?我现在还想知道是否应该安装 MS KB896256 上的补丁。也许我应该与公司桌面构建团队一起解决这个问题?

Myself and another developer on my time recently moved from a Core 2 Duo machine at work to a new Core 2 Quad 9505; both running Windows XP SP3 32-bit with JDK 1.6.0_18.

Upon doing so, a couple of our automated unit tests for some timing/statistics/metrics aggregation code promptly started failing, due to what appear to be ridiculous values coming back from System.nanoTime().

Test code that shows this behaviour, reliably, on my machine is:

import static org.junit.Assert.assertThat;

import org.hamcrest.Matchers;
import org.junit.Test;

public class NanoTest {

  @Test
  public void testNanoTime() throws InterruptedException {
    final long sleepMillis = 5000;

    long nanosBefore = System.nanoTime();
    long millisBefore = System.currentTimeMillis();

    Thread.sleep(sleepMillis);

    long nanosTaken = System.nanoTime() - nanosBefore;
    long millisTaken = System.currentTimeMillis() - millisBefore;

    System.out.println("nanosTaken="+nanosTaken);
    System.out.println("millisTaken="+millisTaken);

    // Check it slept within 10% of requested time
    assertThat((double)millisTaken, Matchers.closeTo(sleepMillis, sleepMillis * 0.1));
    assertThat((double)nanosTaken, Matchers.closeTo(sleepMillis * 1000000, sleepMillis * 1000000 * 0.1));
  }

}

Typical output:

millisTaken=5001
nanosTaken=2243785148

Running it 100x yields nano results between 33% and 60% of the actual sleep time; usually around 40% though.

I understand the weaknesses in accuracy of timers in Windows, and have read related threads like Is System.nanoTime() consistent across threads?, however my understanding is that System.nanoTime() is intended for exactly the purpose we're using it :- measuring elapsed time; more accurately than currentTimeMillis().

Does anyone know why it's returning such crazy results? Is this likely to be a hardware architecture problem (the only major thing that has changed is the CPU/Motherboard on this machine)? A problem with the Windows HAL with my current hardware? A JDK problem? Should I abandon nanoTime()? Should I log a bug somewhere, or any suggestions on how I could investigate further?

UPDATE 19/07 03:15 UTC: After trying finnw's test case below I did some more Googling, coming across entries such as bugid:6440250. It also reminded me of some other strange behaviour I noticed late Friday where pings were coming back negative. So I added /usepmtimer to my boot.ini and now all the tests behave as expected., and my pings are normal too.

I'm a bit confused about why this was still an issue though; from my reading I thought TSC vs PMT issues were largely resolved in Windows XP SP3. Could it be because my machine was originally SP2, and was patched to SP3 rather than installed originally as SP3? I now also wonder whether I should be installing patches like the one at MS KB896256. Maybe I should take this up with the corporate desktop build team?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

没有伤那来痛 2024-09-18 00:12:57

通过将 /usepmtimer 添加到我的 C:\boot.ini字符串;强制 Windows 使用电源管理计时器而不是 TSC。鉴于我使用的是 XP SP3,为什么我需要这样做是一个悬而未决的问题,因为我知道这是默认设置,但也许这是由于我的计算机修补到 SP3 的方式所致。

The problem was resolved (with some open suspicions about the suitability of nanoTime() on multi-core systems!) by adding /usepmtimer to the end of my C:\boot.ini string; forcing Windows to use the Power Management timer rather than the TSC. It's an open question as to why I needed to do this given I'm on XP SP3, as I understood that this was the default, however perhaps it was due to the manner in which my machine was patched to SP3.

半﹌身腐败 2024-09-18 00:12:57

在我的系统(Windows 7 64 位、Core i7 980X)上:

nanosTaken=4999902563
millisTaken=5001

System.nanoTime() 使用特定于操作系统的调用,因此我预计您会在 Windows/处理器组合中看到错误。

On my system (Windows 7 64-Bit, Core i7 980X):

nanosTaken=4999902563
millisTaken=5001

System.nanoTime() uses OS-specific calls, so I expect that you are seeing a bug in your Windows/processor combination.

第七度阳光i 2024-09-18 00:12:57

您可能想阅读其他堆栈溢出问题的答案: Is System.nanoTime() 完全没用?

总之,nanoTime 依赖于操作系统计时器,而该计时器可能会受到多核 CPU 存在的影响。因此,nanoTime 在某些操作系统和 CPU 组合上可能没有那么有用,在打算在多个目标平台上运行的可移植 Java 代码中使用它时应小心。网络上似乎有很多关于这个问题的抱怨,但对于有意义的替代方案却没有太多共识。

You probably want to read the answers to this other stack overflow question: Is System.nanoTime() completely useless?.

In summary, it would appear that nanoTime relies on operating system timers that may be affected by the presence of multiple core CPUs. As such, nanoTime may not be that useful on certain combinations of OS and CPU, and care should be taken when using it in portable Java code that you intend to run on multiple target platforms. There seems to be a lot of complaining on the web on this subject, but not much consensus on a meaningful alternative.

弄潮 2024-09-18 00:12:57

很难判断这是一个错误还是内核之间正常的计时器变化。

您可以尝试的一个实验是使用本机调用来强制线程在特定核心上运行。

另外,为了排除电源管理影响,请尝试在循环中旋转作为 sleep() 的替代方案:

import com.sun.jna.Native;
import com.sun.jna.NativeLong;
import com.sun.jna.platform.win32.Kernel32;
import com.sun.jna.platform.win32.W32API;

public class AffinityTest {

    private static void testNanoTime(boolean sameCore, boolean spin)
    throws InterruptedException {
        W32API.HANDLE hThread = kernel.GetCurrentThread();
        final long sleepMillis = 5000;

        kernel.SetThreadAffinityMask(hThread, new NativeLong(1L));
        Thread.yield();
        long nanosBefore = System.nanoTime();
        long millisBefore = System.currentTimeMillis();

        kernel.SetThreadAffinityMask(hThread, new NativeLong(sameCore? 1L: 2L));
        if (spin) {
            Thread.yield();
            while (System.currentTimeMillis() - millisBefore < sleepMillis)
                ;
        } else {
            Thread.sleep(sleepMillis);
        }

        long nanosTaken = System.nanoTime() - nanosBefore;
        long millisTaken = System.currentTimeMillis() - millisBefore;

        System.out.println("nanosTaken="+nanosTaken);
        System.out.println("millisTaken="+millisTaken);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Sleeping, different cores");
        testNanoTime(false, false);
        System.out.println("\nSleeping, same core");
        testNanoTime(true, false);
        System.out.println("\nSpinning, different cores");
        testNanoTime(false, true);
        System.out.println("\nSpinning, same core");
        testNanoTime(true, true);
    }

    private static final Kernel32Ex kernel =
        (Kernel32Ex) Native.loadLibrary(Kernel32Ex.class);

}

interface Kernel32Ex extends Kernel32 {
    NativeLong SetThreadAffinityMask(HANDLE hThread, NativeLong dwAffinityMask);
}

如果根据内核选择您得到非常不同的结果(例如,同一内核上为 5000ms,但在不同内核上为 2200ms)核心),这表明问题只是核心之间的自然计时器变化。

如果睡眠与旋转得到的结果截然不同,则更有可能是由于电源管理减慢了时钟速度。

如果四个结果中没有一个接近 5000 毫秒,则可能是一个错误。

It is difficult to tell whether this is a bug or just normal timer variation between cores.

An experiment you could try is to use native calls to force the thread to run on a specific core.

Also, to rule out power management effects, try spinning in a loop as an alternative to sleep():

import com.sun.jna.Native;
import com.sun.jna.NativeLong;
import com.sun.jna.platform.win32.Kernel32;
import com.sun.jna.platform.win32.W32API;

public class AffinityTest {

    private static void testNanoTime(boolean sameCore, boolean spin)
    throws InterruptedException {
        W32API.HANDLE hThread = kernel.GetCurrentThread();
        final long sleepMillis = 5000;

        kernel.SetThreadAffinityMask(hThread, new NativeLong(1L));
        Thread.yield();
        long nanosBefore = System.nanoTime();
        long millisBefore = System.currentTimeMillis();

        kernel.SetThreadAffinityMask(hThread, new NativeLong(sameCore? 1L: 2L));
        if (spin) {
            Thread.yield();
            while (System.currentTimeMillis() - millisBefore < sleepMillis)
                ;
        } else {
            Thread.sleep(sleepMillis);
        }

        long nanosTaken = System.nanoTime() - nanosBefore;
        long millisTaken = System.currentTimeMillis() - millisBefore;

        System.out.println("nanosTaken="+nanosTaken);
        System.out.println("millisTaken="+millisTaken);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Sleeping, different cores");
        testNanoTime(false, false);
        System.out.println("\nSleeping, same core");
        testNanoTime(true, false);
        System.out.println("\nSpinning, different cores");
        testNanoTime(false, true);
        System.out.println("\nSpinning, same core");
        testNanoTime(true, true);
    }

    private static final Kernel32Ex kernel =
        (Kernel32Ex) Native.loadLibrary(Kernel32Ex.class);

}

interface Kernel32Ex extends Kernel32 {
    NativeLong SetThreadAffinityMask(HANDLE hThread, NativeLong dwAffinityMask);
}

If you get very different results depending on core selection (e.g. 5000ms on the same core but 2200ms on different cores) that would suggest that the problem is just natural timer variation between cores.

If you get very different results from sleeping vs. spinning, it is more likely due to power management slowing down the clocks.

If none of the four results are close to 5000ms, then it might be a bug.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文