Simulation gives different results with a normal for loop vs. Parallel.For

Posted on 2024-12-28 00:06:47

I am a bit surprised to get different results from a simple simulation sample when I run it with a normal for loop (which gives the correct result) versus Parallel.For. Please help me find the reason. I also observed that the parallel execution is much faster than the normal one.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace Simulation
{
    class Program
    {

    static void Main(string[] args)
    {
        ParalelSimulation(); // result is .757056
        NormalSimulation();  // result is .508021 which is correct
        Console.ReadLine();
    }

    static void ParalelSimulation()
    {
        DateTime startTime = DateTime.Now;

        int trails = 1000000;
        int numberofpeople = 23;
        Random rnd = new Random();
        int matches = 0;

        Parallel.For(0, trails, i =>
            {
                var taken = new List<int>();
                for (int k = 0; k < numberofpeople; k++)
                {
                    var day = rnd.Next(1, 365);
                    if (taken.Contains(day))
                    {
                        matches += 1;
                        break;
                    }
                    taken.Add(day);
                }
            }
        );
        Console.WriteLine((Convert.ToDouble(matches) / trails).ToString());
        TimeSpan ts = DateTime.Now.Subtract(startTime);
        Console.WriteLine("Paralel Time Elapsed: {0} Seconds:MilliSeconds", ts.Seconds + ":" + ts.Milliseconds);
    }
    static void NormalSimulation()
    {
        DateTime startTime = DateTime.Now;

        int trails = 1000000;
        int numberofpeople = 23;
        Random rnd = new Random();
        int matches = 0;

        for (int j = 0; j < trails; j++)
        {
            var taken = new List<int>();
            for (int i = 0; i < numberofpeople; i++)
            {
                var day = rnd.Next(1, 365);
                if (taken.Contains(day))
                {
                    matches += 1;
                    break;
                }
                taken.Add(day);
            }
        }
        Console.WriteLine((Convert.ToDouble(matches) / trails).ToString());
        TimeSpan ts = DateTime.Now.Subtract(startTime);
        Console.WriteLine(" Time Elapsed: {0} Seconds:MilliSeconds", ts.Seconds + ":" + ts.Milliseconds);
    }
}

}

Thanks in Advance


Comments (2)

撩动你心 2025-01-04 00:06:47

A few things:

  1. The Random class is not thread safe. You would need a separate Random instance per worker thread.
  2. You're incrementing the matches variable in a non-thread-safe way. You would want to use Interlocked.Increment(ref matches) to guarantee thread safety when incrementing the variable.
  3. Make sure your for loop and your Parallel.For execute exactly the same number of times: Parallel.For's second parameter is exclusive, so if the for loop uses <=, you would need to add 1 to trails to make them equivalent.

Try this:

// Requires "using System.Threading;" for Interlocked.
static void ParalelSimulationNEW()
{
    DateTime startTime = DateTime.Now;

    int trails = 1000000;
    int numberofpeople = 23;
    int matches = 0;

    Parallel.For(0, trails + 1, _ =>
    {
        Random rnd = new Random();

        var taken = new List<int>();
        for(int k = 0; k < numberofpeople; k++)
        {
            var day = rnd.Next(1, 365);
            if(taken.Contains(day))
            {
                Interlocked.Increment(ref matches);
                break;
            }
            taken.Add(day);
        }
    });
    Console.WriteLine((Convert.ToDouble(matches) / trails).ToString());
    TimeSpan ts = DateTime.Now.Subtract(startTime);
    Console.WriteLine("Paralel Time Elapsed: {0} Seconds:MilliSeconds", ts.Seconds + ":" + ts.Milliseconds);
}

爱已欠费 2025-01-04 00:06:47

The code contains a data race on the update of matches. If two threads do it simultaneously, both can read the same value (say, 10), then both increment it (to 11) and write the new value back. As a result, fewer matches are registered (in my example, 11 instead of 12). The solution is to use System.Threading.Interlocked for this variable.
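
As a minimal sketch of the lost-update problem described above, the following compares an unsynchronized counter with one updated through Interlocked.Increment. The class and method names (CounterDemo, UnsafeCount, SafeCount) are hypothetical and exist only for this illustration; how many updates the unsafe version loses depends on the machine and the run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
static class CounterDemo
{
    // Increments a shared counter from Parallel.For without synchronization.
    // "count += 1" is a read-modify-write sequence, so concurrent updates can be lost.
    public static int UnsafeCount(int n)
    {
        int count = 0;
        Parallel.For(0, n, _ => { count += 1; });
        return count; // may be less than n
    }

    // Same loop, but each increment is performed atomically.
    public static int SafeCount(int n)
    {
        int count = 0;
        Parallel.For(0, n, _ => { Interlocked.Increment(ref count); });
        return count; // always n
    }
}
```

Calling CounterDemo.UnsafeCount(1000000) on a multi-core machine will typically return a value below 1000000, while CounterDemo.SafeCount(1000000) returns exactly 1000000.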

Other issues I see:
- your serial loop includes an iteration for j equal to trails, while the parallel loop does not (the end index is exclusive in Parallel.For);
- class Random might not be thread safe.


Update: I think you will not get the result you want with Drew Marsh's code, because it does not provide enough randomization. Each of the 1M experiments starts with exactly the same random number, because all local instances of Random are initialized with the default seed. Essentially, you repeat the same experiment 1M times, so the result is still skewed. To fix that, you need to seed each randomizer with a new value each time. Update: I was not totally correct here, as the default initialization uses the system clock for the seed; however, MSDN warns that

because the clock has finite resolution, using the parameterless constructor to create different Random objects in close succession creates random number generators that produce identical sequences of random numbers.

So this might still be the reason for insufficient randomization, and with explicit seeds you might get better results. For example, seeding with the index of the outer loop iteration gave a good answer for me:

Parallel.For(0, trails + 1, j =>
{
    Random rnd = new Random(j); // initialized with different seed each time
    /* ... */          
});

However, I noticed that after the initialization of Random was moved into the loop, all the speedup was lost (on my Intel Core i5 laptop). Since I am not a C# expert, I do not know why; I suppose that class Random might have some data shared by all instances, with synchronized access.
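
To illustrate why the seed matters, here is a small sketch showing that two Random instances constructed with the same explicit seed produce the same sequence, which is what happens when clock-seeded instances are created in close succession. SeedDemo and FirstDraws are hypothetical names used only here. The sketch calls Next(1, 366) because the upper bound of Random.Next(min, max) is exclusive; the code in the question uses Next(1, 365), which only yields 1 through 364.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class SeedDemo
{
    // Returns the first `count` simulated birthdays drawn from a Random
    // constructed with the given explicit seed.
    public static int[] FirstDraws(int seed, int count)
    {
        var rnd = new Random(seed);
        var draws = new int[count];
        for (int i = 0; i < count; i++)
        {
            draws[i] = rnd.Next(1, 366); // upper bound is exclusive: days 1..365
        }
        return draws;
    }
}
```

SeedDemo.FirstDraws(7, 5) called twice returns the same five numbers, so trials that share a seed are not independent experiments.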


Update 2: Using ThreadLocal to keep one instance of Random per thread, I got both good accuracy and reasonable speedup:

ThreadLocal<Random> ThreadRnd = new ThreadLocal<Random>(() =>
{
    return new Random(Thread.CurrentThread.GetHashCode());
});
Parallel.For(0, trails + 1, j =>
{
    Random rnd = ThreadRnd.Value;
    /* ... */          
});

Notice how the per-thread randomizers are initialized with the hash code for the currently running instance of Thread.
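
For completeness, the points raised in this thread can be combined into one runnable sketch. Note that it swaps in a different technique from the ThreadLocal snippet above: it uses the Parallel.For overload that takes thread-local state (localInit, body, localFinally), so each worker keeps its own Random and its own tally and touches the shared counter only once at the end. The names BirthdaySimulation, Estimate and seedCounter are hypothetical, and seeding from an incrementing counter is an assumption made for this illustration, not something taken from the answers.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical class, for illustration only.
static class BirthdaySimulation
{
    // Assumption: distinct seeds are handed out from a shared counter.
    private static int seedCounter = Environment.TickCount;

    public static double Estimate(int trials, int people)
    {
        int matches = 0;

        Parallel.For<(Random Rnd, int Hits)>(0, trials,
            // localInit: one Random and one local tally per worker
            () => (new Random(Interlocked.Increment(ref seedCounter)), 0),
            // body: run one trial using only the worker-local state
            (i, loopState, local) =>
            {
                var taken = new HashSet<int>();
                for (int k = 0; k < people; k++)
                {
                    // HashSet.Add returns false when the day was already taken
                    if (!taken.Add(local.Rnd.Next(1, 366)))
                    {
                        local.Hits++;
                        break;
                    }
                }
                return local;
            },
            // localFinally: merge each worker's tally atomically, once per worker
            local => Interlocked.Add(ref matches, local.Hits));

        return (double)matches / trials;
    }
}
```

With 23 people and days drawn from 1 to 365, BirthdaySimulation.Estimate(1000000, 23) should land close to the theoretical value of about 0.507.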
