VS2008 C#: thread pool problem

Posted on 2024-08-14 17:51:42

I am using the following two methods. DoMyWork1 scales well: running three instances on 3 threads takes about 6 seconds. DoMyJob does not scale at all: if one thread takes 4 seconds, running 3 threads takes 13 seconds. What am I doing wrong? Do file reads and/or writes need special thread handling beyond the thread pool?

My calling code:

public static void Process(MyDelegate md , int threads)
{
    int threadcount = threads;

    ManualResetEvent[] doneEvents = new ManualResetEvent[threadcount];

    DateTime dtstart = DateTime.Now;

    List<string> myfiles = GetMyFiles(@"c:\");


    for (int i = 0; i < threadcount; i++)
    {

        doneEvents[i] = new ManualResetEvent(false);
        MyState ms = new MyState();
        ms.ThreadIndex = i;
        ms.EventDone = doneEvents[i];
        ms.files = myfiles;
        ThreadPool.QueueUserWorkItem(md.Invoke, ms);
    }


    WaitHandle.WaitAll(doneEvents);

    DateTime dtend = DateTime.Now;
    TimeSpan ts = dtend - dtstart;
    // Print the elapsed time as a number of seconds rather than as an hh:mm:ss string.
    Console.WriteLine("All complete in {0} seconds.", ts.TotalSeconds);
    Console.ReadLine();

}

public static void DoMyWork1(Object threadContext)
{
    MyState st = (MyState)threadContext;
    Console.WriteLine("thread {0} started...", st.ThreadIndex);

    Thread.Sleep(5000);

    Console.WriteLine("thread {0} finished...", st.ThreadIndex);
    st.EventDone.Set();
}



private static void DoMyJob(MyState st)
{
    Console.WriteLine("I am in thread {0} started...", st.ThreadIndex);


    string[] mystrings = new string[] { "one", "two", "three" };

    foreach (string s in mystrings)
    {
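        foreach (string file in st.files)
        {
            string text;
            // Dispose the reader so the file handle is released as soon as the read completes.
            using (StreamReader reader = new StreamReader(file))
            {
                text = reader.ReadToEnd();
            }

            if (!text.Contains(s))
            {
                AppendToFile(String.Format("{0} word searching in file {1} in thread {2}", s, file, st.ThreadIndex));
            }
        }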
    }

    Console.WriteLine("I am in thread {0} ended...", st.ThreadIndex);
}


Comments (4)

陌生 2024-08-21 17:51:42

Threads can improve a program's performance only if it is starved for CPU resources. That's not the case for your program, as should be readily visible in the Performance tab of Taskmgr.exe. The slow resource here is your hard disk or the network card. The ReadToEnd() call is glacially slow because it waits for the disk to retrieve the file data. Anything else you do with the file data is easily 3 orders of magnitude faster than that.

The threads will just wait in turn for the disk data. In fact, there's a good chance that the threads actually make your program run a lot slower: because each thread is working with a different file, they cause the disk drive head to jump back and forth between disjoint tracks on the disk. Seeking to another track is the one thing that is really slow, typically around 10 msec for a fast disk. A multi-GHz CPU can execute tens of millions of instructions in that time.

You can't make your program run faster unless you get a faster disk. SSDs are nice. Beware of the effects of the file system cache: the second time you run your program it will run very fast, because the file data is retrieved from the cache instead of the disk. This will rarely happen in a production environment.
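One way to check the claim that reading dominates is to time the read and the search separately. The following is only a minimal sketch: the class name `IoVersusCpu` and the method `Measure` are hypothetical, and `File.ReadAllText` stands in for the question's `StreamReader.ReadToEnd()` call.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class IoVersusCpu
{
    // Hypothetical helper: times the file read and the in-memory search separately.
    // Returns whether the word was found; elapsed times come back in milliseconds.
    public static bool Measure(string file, string word, out long readMs, out long searchMs)
    {
        Stopwatch sw = Stopwatch.StartNew();
        string text = File.ReadAllText(file);   // the I/O-bound part
        sw.Stop();
        readMs = sw.ElapsedMilliseconds;

        sw.Reset();
        sw.Start();
        bool found = text.Contains(word);       // the CPU-bound part
        sw.Stop();
        searchMs = sw.ElapsedMilliseconds;
        return found;
    }

    public static void Main()
    {
        // Illustration only: a temporary file instead of the files under c:\.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, new string('x', 1000000) + " three");
            long readMs, searchMs;
            bool found = Measure(path, "three", out readMs, out searchMs);
            Console.WriteLine("found={0} read={1} ms search={2} ms", found, readMs, searchMs);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

A file that was just written or recently read is likely to be served from the file system cache, so the read time measured this way can look far better than a cold read from disk.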

叶落知秋 2024-08-21 17:51:42

All file access is serialized at the OS layer, so threading it will result in exactly what you see.

风渺 2024-08-21 17:51:42

I'm a little surprised. I'd expect the first access to these files to populate the cache, and the remaining accesses to just hit memory, so three threads shouldn't be much slower than one. If you're writing to each file, that would make a difference. What exactly does the AppendToFile function do?
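The question does not show AppendToFile, so the following is purely an assumption about what it might look like: every thread appending to one shared log file behind a lock. If the real function works anything like this, all writes go through a single file and a single lock, which would limit how much the three threads can overlap. The class name `SharedLog`, the field `LogPath`, and the log file name are hypothetical.

```csharp
using System;
using System.IO;

public static class SharedLog
{
    // Hypothetical: one lock object shared by every worker thread.
    private static readonly object Gate = new object();

    // Hypothetical log location; the question does not say where AppendToFile writes.
    public static string LogPath = Path.Combine(Path.GetTempPath(), "search-log.txt");

    // Assumed shape of AppendToFile: open, append one line, close, all under the lock.
    public static void AppendToFile(string line)
    {
        lock (Gate)
        {
            File.AppendAllText(LogPath, line + Environment.NewLine);
        }
    }

    public static void Main()
    {
        AppendToFile("three word searching in file a.txt in thread 0");
        Console.WriteLine("appended one line to {0}", LogPath);
    }
}
```

Opening and closing the file on every call, as this sketch does, adds more disk work per message; collecting messages in memory and writing them once at the end would be one alternative.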

乱世争霸 2024-08-21 17:51:42

One problem could be that you are opening and reading each file once for every string you are looking for.

What would happen if you switched the order of your foreach loops and only appended to the file as needed?

I think you would see much better performance.

Ideally, if you can take the file reading out of the loop altogether, that would be the fastest. I/O-bound operations will always cause context switches while waiting on the disk to return the data.
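The loop swap suggested above could be sketched roughly as follows: each file is read once, and all three words are checked against the text already in memory. This is an illustration under assumptions, not the asker's code: the class `SinglePassSearch` and the method `FindMissingWords` are hypothetical names, and the method returns the messages instead of calling the question's AppendToFile.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SinglePassSearch
{
    // Hypothetical helper: returns one message per (word, file) pair where the file
    // does NOT contain the word, the same condition DoMyJob tests in the question.
    public static List<string> FindMissingWords(IEnumerable<string> files, string[] words)
    {
        List<string> messages = new List<string>();
        foreach (string file in files)
        {
            // Read the file once, outside the word loop.
            string text = File.ReadAllText(file);
            foreach (string word in words)
            {
                if (!text.Contains(word))
                {
                    messages.Add(String.Format("{0} not found in file {1}", word, file));
                }
            }
        }
        return messages;
    }

    public static void Main()
    {
        // Illustration only: a temporary file instead of the files under c:\.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "one two");
            string[] words = new string[] { "one", "two", "three" };
            foreach (string message in FindMissingWords(new string[] { path }, words))
            {
                Console.WriteLine(message);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With three search words this cuts the number of file reads to a third. The messages come out grouped by file rather than by word, which differs from the order in the original loops.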
