How to handle a large number of concurrent disk write requests as efficiently as possible



Say the method below is being called several thousand times by different threads in a .NET 4 application. What's the best way to handle this situation? I understand that the disk is the bottleneck here, but I'd like the WriteFile() method to return quickly.

Data can be up to a few MB. Are we talking thread pool, TPL, or the like?

public void WriteFile(string FileName, MemoryStream Data)
{
   try
   {
      using (FileStream DiskFile = File.OpenWrite(FileName))
      {
         Data.WriteTo(DiskFile);
         DiskFile.Flush();
         DiskFile.Close();
      }
   }
   catch (Exception e)
   {
      Console.WriteLine(e.Message);
   }
}


4 Answers

像极了他 2024-12-11 17:58:21


If you want to return quickly and don't really care that the operation is synchronous, you could create some kind of in-memory Queue where you put write requests; while the queue is not full, the method can return quickly. Another thread is then responsible for draining the queue and writing the files. If WriteFile is called while the queue is full, you will have to wait until you can enqueue, and execution becomes synchronous again. But this way you can have a big buffer, so if the stream of file-write requests is not steady but spiky (with pauses between bursts of WriteFile calls), this change can be seen as an improvement in your performance.

UPDATE:
I made a little picture for you. Notice that the bottleneck always exists; all you can possibly do is optimize requests by using a queue. Note that the queue has limits, so when it fills up you cannot instantly queue more files into it; you have to wait for free space in that buffer too. But for the situation shown in the picture (three bucket requests), it's obvious that you can quickly put the buckets into the queue and return, while in the first case you have to do that one by one and block execution.

Notice that you never need to run many IO threads at once, since they will all share the same bottleneck, and you will just be wasting memory if you try to parallelize this heavily. I believe 2 to 10 threads at most will easily take all the available IO bandwidth, and that will also limit the application's memory usage.

(Diagram: several write requests placed quickly into a bounded queue and drained by a single writer thread, versus blocking on each write one by one.)
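A minimal sketch of that bounded-queue idea (my illustration, not the answerer's code; it assumes .NET 4's BlockingCollection, and the class name and capacity are invented for the example):

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public class QueuedFileWriter : IDisposable
{
    // Bounded buffer: Add() returns quickly while there is room and blocks
    // when full, which is the fall-back-to-synchronous behavior described above.
    private readonly BlockingCollection<Tuple<string, MemoryStream>> _queue;
    private readonly Task _consumer;

    public QueuedFileWriter(int capacity)
    {
        _queue = new BlockingCollection<Tuple<string, MemoryStream>>(capacity);
        // A single long-running writer thread drains the queue.
        _consumer = Task.Factory.StartNew(Drain, TaskCreationOptions.LongRunning);
    }

    public void WriteFile(string fileName, MemoryStream data)
    {
        _queue.Add(Tuple.Create(fileName, data));
    }

    private void Drain()
    {
        foreach (var item in _queue.GetConsumingEnumerable())
        {
            try
            {
                using (FileStream diskFile = File.OpenWrite(item.Item1))
                {
                    item.Item2.WriteTo(diskFile);
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the consumer finish pending writes
        _consumer.Wait();
        _queue.Dispose();
    }
}

The capacity is what bounds memory use: the buffer can absorb a spike of multi-megabyte streams while one thread keeps the disk busy, and callers only block once the spike outruns the buffer.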

紫瑟鸿黎 2024-12-11 17:58:21


Since you say that the files don't need to be written in order or immediately, the simplest approach would be to use a Task:

private void WriteFileAsynchronously(string FileName, MemoryStream Data)
{
    Task.Factory.StartNew(() => WriteFileSynchronously(FileName, Data));
}

private void WriteFileSynchronously(string FileName, MemoryStream Data)
{
    try
    {
        using (FileStream DiskFile = File.OpenWrite(FileName))
        {
            Data.WriteTo(DiskFile);
            DiskFile.Flush();
            DiskFile.Close();
        }
    }

    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
}

The TPL uses the thread pool internally, and should be fairly efficient even for large numbers of tasks.
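One small variant worth noting (my addition, not part of the original answer): if you return the Task instead of discarding it, a caller can still fire and forget by default, but can also wait on a particular write when that occasionally matters:

// Same body as WriteFileAsynchronously above, but the Task is returned
// so a caller may optionally Wait() on it or inspect IsFaulted later.
private Task WriteFileAsync(string FileName, MemoryStream Data)
{
    return Task.Factory.StartNew(() => WriteFileSynchronously(FileName, Data));
}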

幻想少年梦 2024-12-11 17:58:21


If data is coming in faster than you can log it, you have a real problem. A producer/consumer design that has WriteFile just throwing stuff into a ConcurrentQueue or similar structure, with a separate thread servicing that queue, works great ... until the queue fills up. And if you're talking about opening 50,000 different files, things are going to back up quickly. Not to mention that data of up to several megabytes per file will further limit the practical size of your queue.

I've had a similar problem that I solved by having the WriteFile method append to a single file. The records it wrote had a record number, file name, length, and then the data. As Hans pointed out in a comment to your original question, writing to a file is quick; opening a file is slow.

A second thread in my program starts reading that file that WriteFile is writing to. That thread reads each record header (number, filename, length), opens a new file, and then copies data from the log file to the final file.

This works better if the log file and the final file are are on different disks, but it can still work well with a single spindle. It sure exercises your hard drive, though.

It has the drawback of requiring 2X the disk space, but with 2-terabyte drives under $150, I don't consider that much of a problem. It's also less efficient overall than directly writing the data (because you have to handle the data twice), but it has the benefit of not causing the main processing thread to stall.
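A minimal sketch of that single-log-file record format (my illustration of the description above: record number, file name, length, then data; the class name and locking scheme are assumptions, not the answerer's actual code):

using System;
using System.IO;

public class LogWriter
{
    private readonly BinaryWriter _log;
    private readonly object _gate = new object();
    private int _recordNumber;

    public LogWriter(string logPath)
    {
        // Opened once: opening files is the slow part, appending is fast.
        _log = new BinaryWriter(File.Open(logPath, FileMode.Append, FileAccess.Write));
    }

    public void WriteFile(string fileName, MemoryStream data)
    {
        lock (_gate) // serialize writers so records never interleave
        {
            _log.Write(++_recordNumber);   // record number
            _log.Write(fileName);          // length-prefixed file name
            _log.Write((int)data.Length);  // payload length
            data.WriteTo(_log.BaseStream); // payload bytes
            _log.Flush();
        }
    }
}

The second thread would then read records back in the same order (number, name, length, bytes) and copy each payload out into its final file, as described above.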

寻找一个思念的角度 2024-12-11 17:58:21


Encapsulate your complete method implementation in a new Thread(). Then you can "fire-and-forget" these threads and return to the main calling thread.

    // Assumes each item in filesArray carries the target name and payload;
    // the members below are illustrative, since the original snippet
    // referenced fileName/data without defining them.
    foreach (var file in filesArray)
    {
        var item = file; // copy for the closure: in C# 4 the foreach variable
                         // is shared across iterations, so capturing it
                         // directly could write the wrong file's data
        try
        {
            System.Threading.Thread updateThread = new System.Threading.Thread(delegate()
                {
                    WriteFileSynchronous(item.Name, item.Data);
                });
            updateThread.Start();
        }
        catch (Exception ex)
        {
            // Note: this catch only sees failures creating or starting the
            // thread; exceptions thrown inside the delegate are not caught here.
            string errMsg = ex.Message;
            Exception innerEx = ex.InnerException;
            while (innerEx != null)
            {
                errMsg += "\n" + innerEx.Message;
                innerEx = innerEx.InnerException;
            }
            errorMessages.Add(errMsg); // errorMessages: a shared list assumed by the original
        }
    }