Precisely measuring the execution time of code in a thread (C#)

Posted 2024-12-07 09:00:47

I'm trying to measure the execution time of some bits of code as accurately as possible on a number of threads, taking context switching and thread downtime into account. The application is implemented in C# (VS 2008). Example:

using System.Diagnostics;
using System.Threading;

// Both methods are static so that Main can serve as the entry point
// and pass ThreadFunc directly to the Thread constructor.
public static void ThreadFunc ()
{
    // Some code here

    // Critical block #1 begins here
    long lTimestamp1 = Stopwatch.GetTimestamp ();

    CallComplex3rdPartyFunc (); // A

    long lTimestamp2 = Stopwatch.GetTimestamp ();
    // Critical block #1 ends here

    // Some code here

    // Critical block #2 begins here
    long lTimestamp3 = Stopwatch.GetTimestamp ();

    CallOtherComplex3rdPartyFunc (); // B

    long lTimestamp4 = Stopwatch.GetTimestamp ();
    // Critical block #2 ends here

    // Save timestamps for future analysis.
}

public static int Main ( string[] sArgs )
{
    // Some code here

    int nCount = SomeFunc ();

    for ( int i = 0; i < nCount; i++ )
    {
        Thread oThread = new Thread ( ThreadFunc );
        oThread.Start ();
    }

    // Some code here

    return ( 0 );
}

I'd like to measure the execution time of the above two critical code blocks as accurately as possible. The two calls marked as A and B are potentially long function calls: they may sometimes take several seconds to execute, but in some cases they complete in a few milliseconds.
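For reference, the values returned by Stopwatch.GetTimestamp() are raw counter ticks, not time units. A minimal sketch of turning a pair of them into wall-clock seconds could look like the following; the class and method names (TickMath, ElapsedSeconds) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Diagnostics;

public static class TickMath
{
    // Converts two Stopwatch.GetTimestamp() readings into elapsed
    // wall-clock seconds. 'frequency' is the counter rate in ticks
    // per second.
    public static double ElapsedSeconds(long start, long end, long frequency)
    {
        return (end - start) / (double)frequency;
    }

    // Convenience overload using the rate of the running machine.
    public static double ElapsedSeconds(long start, long end)
    {
        return ElapsedSeconds(start, end, Stopwatch.Frequency);
    }
}
```

With the variables from ThreadFunc, block #1 would be TickMath.ElapsedSeconds(lTimestamp1, lTimestamp2). This is still wall-clock time, so it includes any period in which the thread was descheduled.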

I'm running the above code on a number of threads: somewhere between 1 and 200, depending on user input. The computers running this code have 2-16 cores, and users use lower thread counts on the weaker machines.

The problem is that A and B are both potentially long functions, so it's very likely that at least one context switch will happen during their execution, possibly more than one. So the code gets lTimestamp1, then another thread starts executing (and the current thread waits). Eventually the current thread gets back control and retrieves lTimestamp2.

This means that the duration between lTimestamp1 and lTimestamp2 includes time when the thread was not actually running - it was waiting to be scheduled again while other threads executed. The tick count, however, increases anyway, so the duration is now really

Code block time = A + B + some time spent in other threads

while I want it to be only

Code block time = A + B

This is especially an issue with a larger number of threads: every thread gets its turn to run, so each measurement also absorbs the time all the other threads spend running before the thread in question is scheduled again.

So my question is: is it possible to somehow calculate the time when the thread is not running and then adjust the above timings accordingly? I'd like to eliminate (subtract) that 3rd term entirely or at least as much of it as possible. The code runs millions of times, so final timings are calculated from a lot of samples and then averaged out.

I'm not looking for profiler products, etc. The application itself needs to time the marked parts as accurately as possible. The functions A and B are 3rd-party functions and I cannot change them in any way. I'm also aware of the possible fluctuations when measuring time with nanosecond precision and of possible overhead inside those 3rd-party functions, but I still need to do this measurement.

Any advice would be greatly appreciated - C++ or x86 assembly code would work as well.

Edit: this seems to be impossible to implement. Scott's idea below (using GetThreadTimes) is good, but unfortunately GetThreadTimes() is a flawed API and it almost never returns correct data. Thanks for all the replies!

Comments (2)

第七度阳光i 2024-12-14 09:00:47

This can be done with the native API call GetThreadTimes. There is an article on CodeProject that uses it.

A second option is to use QueryThreadCycleTime. This will not give you the time, but it will give you the number of cycles the current thread has been executing.

Be aware that you can't directly convert cycles to seconds: many processors (especially mobile processors) do not run at a fixed speed, so there is no constant you could multiply by to get the elapsed time in seconds. If you are using a processor that does not vary its speed, however, getting the time from the cycle count is simple arithmetic.
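A rough sketch of calling GetThreadTimes from C# through P/Invoke, assuming Windows and assuming the managed thread stays on one OS thread for the duration of the measurement. The wrapper names (ThreadCpuClock, CurrentThreadCpuTime, ToTimeSpan) are hypothetical. The kernel and user values are FILETIME counts in 100-nanosecond units, which happens to match TimeSpan ticks.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class ThreadCpuClock
{
    // Returns a pseudo-handle for the calling thread; it needs no closing.
    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentThread();

    // Each FILETIME is marshalled as a 64-bit count of 100 ns units.
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetThreadTimes(IntPtr hThread,
        out long creationTime, out long exitTime,
        out long kernelTime, out long userTime);

    // Combines kernel-mode and user-mode time into one TimeSpan.
    public static TimeSpan ToTimeSpan(long kernel100ns, long user100ns)
    {
        return TimeSpan.FromTicks(kernel100ns + user100ns);
    }

    // CPU time consumed so far by the calling thread (Windows only).
    public static TimeSpan CurrentThreadCpuTime()
    {
        long creation, exit, kernel, user;
        if (!GetThreadTimes(GetCurrentThread(),
                out creation, out exit, out kernel, out user))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return ToTimeSpan(kernel, user);
    }
}
```

Usage would be to read CurrentThreadCpuTime() before and after the call to A and subtract. Note that these counters are typically updated only at the scheduler's clock interval (commonly around 15.6 ms), so short blocks can read as zero, which may be what the asker ran into.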
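A comparable sketch for QueryThreadCycleTime, assuming Windows Vista or later. The wrapper names (ThreadCycleClock, CurrentThreadCycles, CyclesToSeconds) and the assumedHz parameter are illustrative, and the conversion helper is only meaningful under the fixed-clock assumption described above.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class ThreadCycleClock
{
    // Returns a pseudo-handle for the calling thread; it needs no closing.
    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentThread();

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool QueryThreadCycleTime(IntPtr threadHandle,
        out ulong cycleTime);

    // CPU cycles charged to the calling thread so far (Windows only).
    public static ulong CurrentThreadCycles()
    {
        ulong cycles;
        if (!QueryThreadCycleTime(GetCurrentThread(), out cycles))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return cycles;
    }

    // Converts a cycle delta to seconds. 'assumedHz' is an ASSUMED
    // constant clock rate; the result is wrong if the CPU changes speed.
    public static double CyclesToSeconds(ulong cycles, double assumedHz)
    {
        return cycles / assumedHz;
    }
}
```

For relative comparisons between runs of A and B, the raw cycle deltas may be enough without converting to seconds at all.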

清眉祭 2024-12-14 09:00:47

You can use the Stopwatch.Start() and Stopwatch.Stop() methods to pause and resume time measurement. Stopping does not reset the Elapsed/ElapsedMilliseconds value, so perhaps you can leverage this.

Regarding thread context switches: I believe there is no way to handle them in managed code, so it is not possible to exclude the time during which the thread was suspended.
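A small sketch of the pause/resume pattern, with hypothetical names (AccumulatingTimer, TimeBlocks). One Stopwatch accumulates the time of two blocks while the code between them is left untimed.

```csharp
using System;
using System.Diagnostics;

public static class AccumulatingTimer
{
    // Runs three actions in order and returns the combined wall-clock
    // time of 'first' and 'second' only; 'between' is not timed.
    public static TimeSpan TimeBlocks(Action first, Action between, Action second)
    {
        Stopwatch sw = new Stopwatch();

        sw.Start();
        first();
        sw.Stop();      // pauses; Elapsed keeps its value

        between();      // runs while the stopwatch is paused

        sw.Start();     // resumes from the previous Elapsed value
        second();
        sw.Stop();

        return sw.Elapsed;
    }
}
```

This only excludes code the caller chooses to leave outside the Start/Stop pairs. It is still wall-clock time, so context switches that happen inside first or second are counted.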

EDIT:

An interesting article with benchmarks: How long does it take to make a context switch?
