Interthread communication time

Posted on 2024-08-04 10:03:33

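I am chaining together 15 async operations through ports and receivers. This has left me very concerned with the interthread messaging time, specifically the time between a task posting data to a port and a new task beginning to process that same data on a different thread. Assuming the best case, where each thread is idle at the start, I wrote a test that uses the Stopwatch class to measure that time between two different dispatchers, each operating at highest priority with a single thread.

What I found surprised me. My development rig is a Q6600 quad-core 2.4 GHz computer running Windows 7 x64, and the average context switch time in my test was 5.66 microseconds, with a standard deviation of 5.738 microseconds and a maximum of nearly 1.58 milliseconds (a factor of 282!). The Stopwatch tick period is 427.7 nanoseconds, so I am still well out of sensor noise.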
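One detail worth pinning down when reading these numbers: `Stopwatch.Frequency` is reported in ticks per second, so the 427.7 ns figure is the tick period (its reciprocal), not the frequency itself. A minimal sketch of the conversion follows; the class and method names are hypothetical and not part of the original test.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original test code.
static class StopwatchInfo
{
    // Stopwatch.Frequency is in ticks per second, so the period of one tick
    // in nanoseconds is 1e9 divided by that frequency.
    public static double NanosecondsPerTick(long frequency)
    {
        return 1e9 / frequency;
    }

    // Converts raw Stopwatch ticks (e.g. a difference of Stopwatch.GetTimestamp()
    // values) to microseconds. Note these are NOT TimeSpan ticks, which are
    // always 100 ns each.
    public static double TicksToMicroseconds(long stopwatchTicks, long frequency)
    {
        return stopwatchTicks * 1e6 / frequency;
    }

    // Summarizes the timer on the current machine.
    public static string Describe()
    {
        return string.Format(
            "IsHighResolution={0}, Frequency={1} Hz, tick period={2:F1} ns",
            Stopwatch.IsHighResolution,
            Stopwatch.Frequency,
            NanosecondsPerTick(Stopwatch.Frequency));
    }
}
```

Calling `StopwatchInfo.Describe()` prints the actual values for the machine it runs on. A mean of 5.66 µs with a 427.7 ns tick period is roughly 13 ticks, so individual samples are quantized fairly coarsely.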

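What I would like to do is reduce the interthread messaging time as much as possible and, equally important, reduce the standard deviation of the context switch. I realize Windows is not a real-time OS and there are no guarantees, but the Windows scheduler is a fair, round-robin, priority-based scheduler, and the two threads in this test are both at the highest priority (the only threads that should be that high), so there should not be any context switches on the threads (evident from the 1.58 ms largest time; I believe the Windows quantum is 15.65 ms?). The only thing I can think of is variation in the timing of the OS calls to the locking mechanisms used by the CCR to pass messages between threads.

Please let me know if anyone else has measured interthread messaging time and has any suggestions on how to improve it.

Here is the source code from my tests: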

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using Microsoft.Ccr.Core;

using System.Diagnostics;

namespace Test.CCR.TestConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Starting Timer");
            var sw = new Stopwatch();
            sw.Start();

            var dispatcher = new Dispatcher(1, ThreadPriority.Highest, true, "My Thread Pool");
            var dispQueue = new DispatcherQueue("Disp Queue", dispatcher);

            var sDispatcher = new Dispatcher(1, ThreadPriority.Highest, true, "Second Dispatcher");
            var sDispQueue = new DispatcherQueue("Second Queue", sDispatcher);

            var legAPort = new Port<EmptyValue>();
            var legBPort = new Port<TimeSpan>();

            var distances = new List<double>();

            long totalTicks = 0;

            while (sw.Elapsed.TotalMilliseconds < 5000) ;

            int runCnt = 100000;
            int offset = 1000;

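            // Leg A: runs on the first dispatcher's thread; stamps the send time and posts it to leg B.
            Arbiter.Activate(dispQueue, Arbiter.Receive(true, legAPort, i =>
            {
                TimeSpan sTime = sw.Elapsed;
                legBPort.Post(sTime);
            }));

            // Leg B: runs on the second dispatcher's thread; records the time elapsed since the post.
            Arbiter.Activate(sDispQueue, Arbiter.Receive(true, legBPort, i =>
            {
                TimeSpan eTime = sw.Elapsed;
                TimeSpan dt = eTime.Subtract(i);
                //if (distances.Count == 0 || Math.Abs(distances[distances.Count - 1] - dt.TotalMilliseconds) / distances[distances.Count - 1] > 0.1)
                distances.Add(dt.TotalMilliseconds);

                // Skip the first `offset` samples as warm-up when accumulating the average.
                if (distances.Count > offset)
                    Interlocked.Add(ref totalTicks, dt.Ticks);

                // Start the next exchange until runCnt samples have been collected.
                if (distances.Count < runCnt)
                    legAPort.Post(EmptyValue.SharedInstance);
            }));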


            //Thread.Sleep(100);
            legAPort.Post(EmptyValue.SharedInstance);

            Thread.Sleep(500);

            while (distances.Count < runCnt)
                Thread.Sleep(25);

            TimeSpan exTime = TimeSpan.FromTicks(totalTicks);
            double exMS = exTime.TotalMilliseconds / (runCnt - offset);

            Console.WriteLine("Exchange Time: {0} Stopwatch Resolution: {1}", exMS, Stopwatch.Frequency);

            using(var stw = new StreamWriter("test.csv"))
            {
                for(int ix=0; ix < distances.Count; ix++)
                {
                    stw.WriteLine("{0},{1}", ix, distances[ix]);
                }
                stw.Flush();
            }

            Console.ReadKey();
        }
    }
}


鸩远一方 2024-08-11 10:03:33

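Windows is not a real-time OS, but you knew that already. What is killing you is the context switch time, not necessarily the message time. You did not really specify how your inter-process communication works. If you are really just running multiple threads, you will find some gains by not using Windows messages as the communication protocol; instead, try rolling your own IPC using application-hosted message queues.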
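As a rough illustration of an application-hosted queue, here is a minimal sketch that hands timestamps from one thread to another through an in-process `BlockingCollection<long>` and records the one-way latency of each handoff. It is not the CCR mechanism from the question, and the class and method names are hypothetical. `BlockingCollection` still falls back to kernel waits when the consumer is idle, so this shows the measurement technique and does not promise lower latency.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: measures producer-to-consumer handoff latency
// through an in-process queue.
static class HandoffLatency
{
    // Posts `count` timestamps from the calling thread to a consumer thread and
    // returns the measured one-way latencies in microseconds.
    public static double[] Measure(int count)
    {
        var latenciesUs = new double[count];
        using (var queue = new BlockingCollection<long>(new ConcurrentQueue<long>()))
        using (var consumed = new AutoResetEvent(false))
        {
            var consumer = new Thread(() =>
            {
                int i = 0;
                // Blocks until an item arrives; ends after CompleteAdding().
                foreach (long sentAt in queue.GetConsumingEnumerable())
                {
                    long now = Stopwatch.GetTimestamp();
                    latenciesUs[i++] = (now - sentAt) * 1e6 / Stopwatch.Frequency;
                    consumed.Set();
                }
            });
            consumer.IsBackground = true;
            consumer.Start();

            for (int n = 0; n < count; n++)
            {
                queue.Add(Stopwatch.GetTimestamp());
                // Keep one message in flight at a time, as in the question's test.
                consumed.WaitOne();
            }

            queue.CompleteAdding();
            consumer.Join();
        }
        return latenciesUs;
    }
}
```

Calling `HandoffLatency.Measure(100000)` and computing the mean, standard deviation, and maximum of the result gives figures that can be compared directly with the CCR numbers in the question.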

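The best average you can hope for is 1 ms with any version of Windows when context switches occur. You are probably seeing the 1 ms times when your application has to yield to the kernel. This is by design for user-space (ring 3) applications. If it is absolutely critical that you get below 1 ms, you will need to move some of your application into ring 0, which means writing a device driver.

Device drivers do not suffer the same context switch times that user applications do, and they also have access to nanosecond-resolution timers and sleep calls. If you do need to do this, the DDK (Device Driver Development Kit) is freely available from Microsoft, but I would highly recommend investing in a third-party development kit. They usually have really good samples and lots of wizards to set things up correctly, which would otherwise take you months of reading DDK documentation to discover. You will also want something like SoftICE, because the normal Visual Studio debugger will not help you debug device drivers.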


很酷又爱笑 2024-08-11 10:03:33

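Do the 15 asynchronous operations have to be asynchronous? That is, are you forced to operate this way by a limitation of some library, or do you have the choice to make synchronous calls?

If you have the choice, you need to structure your application so that the use of asynchronicity is controlled by a configuration parameter. The difference between asynchronous operations that return on a different thread and synchronous operations that return on the same thread should be transparent in the code. That way you can tune it without changing your code structure.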
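One way to make that switch transparent, sketched here with the Task Parallel Library and not with the CCR: a small runner that executes each stage either inline or on the thread pool, depending on a flag that could come from configuration. `StageRunner` and its members are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: callers always get a Task<TOut>, so the calling code
// looks the same whether a stage runs inline or on another thread.
sealed class StageRunner
{
    private readonly bool _runAsync;

    // `runAsync` would typically be read from a configuration setting.
    public StageRunner(bool runAsync)
    {
        _runAsync = runAsync;
    }

    public Task<TOut> Run<TIn, TOut>(Func<TIn, TOut> stage, TIn input)
    {
        if (_runAsync)
        {
            // Asynchronous mode: the stage runs on a thread-pool thread.
            return Task.Run(() => stage(input));
        }

        // Synchronous mode: the stage runs on the caller's thread, with no
        // context switch. Exceptions propagate directly to the caller here.
        return Task.FromResult(stage(input));
    }
}
```

With this shape, a chain of stages can be written once, for example `runner.Run<int, int>(x => x + 1, 1)`, and the cost of the thread hops can be measured simply by flipping the flag.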

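The phrase "embarrassingly parallel" describes an algorithm in which the majority of the work being done is obviously independent and so can be done in any order, making it easy to parallelize.

But you are "chaining together 15 async operations through ports and receivers". This could be described as "embarrassingly sequential". In other words, the same program could logically be written on a single thread. But then you would lose any parallelism for the CPU-bound work occurring between the async operations (assuming there is any of significance).

If you write a simple test that cuts out any significant CPU-bound work and just measures the context switching time, then guess what: you are going to be measuring the variation in the context switching time, as you have discovered.

The only reason to run multiple threads is that you have a significant amount of work for the CPUs to do, and so you would like to share it out among several CPUs. If the individual chunks of computation are short-lived enough, context switching will be a significant overhead on any OS. By breaking your computation down into 15 stages, each very short, you are essentially asking the OS to do a lot of unnecessary context switching.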


假装不在乎 2024-08-11 10:03:33

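ThreadPriority.Highest does not mean that only the thread scheduler itself has a higher priority. The Win32 API has a more fine-grained set of thread priorities, with several levels above Highest (IIRC, Highest is typically the highest priority that non-admin code can run at; admins can schedule higher priorities, as can any hardware drivers or kernel-mode code), so there is no guarantee that your threads will not be preempted.

Even if a thread is running at the highest priority, Windows can boost other threads above their base priority if they hold resource locks that higher-priority threads require, which is another way you may suffer context switches.

Even then, as you say, Windows is not a real-time OS and is not guaranteed to honor thread priorities anyway.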
