High-performance file operations and I/O completion threads

Posted 2024-09-25 06:59:12

Two questions on file performance:

I need to make a server that handles potentially thousands of simultaneous requests for:

  • Hashing of files
  • Compression of files
  • Decompression of files
  • Possibly some file copy / moves as well

I can't control a customer's hardware (RAID configurations, etc.), so I assume all I can do is request hundreds of file operations and let the OS and disk controller provide whatever optimizations they can. Correct?

Next question: I would like to maximize use of I/O completion threads (instead of worker threads). The only ones I believe are available to me, via .NET 3.5 anyway, are offered via "BeginRead/Write" in:

  • System.IO.Compression.DeflateStream
  • System.IO.Compression.GZipStream
  • System.IO.FileStream
  • System.IO.Stream

Is there something I'm missing that would give me the ability to use an I/O completion thread for hashing files? Does the 7Zip SDK use I/O completion threads?

Comments (2)

丢了幸福的猪 2024-10-02 06:59:12

First, while .NET is pretty good performance-wise, if very high performance is a basic requirement, I would turn to a native-compiled, unmanaged language like C++. JIT compilation and the other overheads of the CLR are going to slow down the performance of any algorithm written in .NET.

I think that thousands of truly simultaneous requests are going to indicate a highly distributed model; right now, the best server hardware on the market (dual Xeon quad-core hyperthreading CPUs) will only do 32 things at once, and listening for requests to do things, talking to the hardware layer, and other general OS/runtime overhead will take up a couple of those. I would analyze the real traffic you expect this server to handle concurrently, and scale the number of boxes you have working on it to match.

Second, I think that what you mean by "I/O completion threads" are the threads that the asynchronous Begin/End calls use to do their job, as opposed to threads from the ThreadPool (avoid these in really thread-heavy apps) or user-created threads (no problem with these, just watch your thread count). Really, except for a few special cases, a thread is a thread, and exactly where it's spawned doesn't make much difference at the hardware level. So if you really wanted to, spawning worker threads that use the synchronous calls would get you pretty much the same result (but it's generally better to use the tools you have than to forge new ones).

Now, to your real question. No, there is not an asynchronous model for hashing; if you want to multithread a hashing operation, the thread must be spawned separately. However, hashing requires a stream or byte buffer, which can be obtained asynchronously using Stream.BeginRead(), and the callback method passed to BeginRead() can perform the hashing on the thread that runs the asynchronous call's callback.
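To make that last point concrete, here is a minimal sketch of hashing a file incrementally from BeginRead() callbacks. The class name `AsyncFileHasher`, the 64 KB buffer size, the choice of SHA-256, and the `onDone` callback shape are all assumptions made for illustration; none of them come from the posts above. It relies only on `FileStream.BeginRead`/`EndRead` and `HashAlgorithm.TransformBlock`/`TransformFinalBlock`, and avoids language features newer than C# 3.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (illustrative name): hashes a file chunk by chunk,
// feeding each chunk to the hash as its asynchronous read completes.
public sealed class AsyncFileHasher
{
    private readonly FileStream _stream;
    private readonly HashAlgorithm _hash = SHA256.Create(); // assumed algorithm
    private readonly byte[] _buffer = new byte[64 * 1024];  // assumed chunk size
    private readonly Action<byte[], Exception> _onDone;

    private AsyncFileHasher(string path, Action<byte[], Exception> onDone)
    {
        _onDone = onDone;
        // The final argument (useAsync = true) requests overlapped I/O, which is
        // what lets read completions arrive on I/O completion threads on Windows.
        _stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                 FileShare.Read, _buffer.Length, true);
    }

    // onDone receives (digest, null) on success or (null, exception) on failure.
    public static void Start(string path, Action<byte[], Exception> onDone)
    {
        new AsyncFileHasher(path, onDone).Pump(null);
    }

    // Processes the completed read 'ar' (if any), then keeps issuing reads.
    // Reads that complete synchronously are handled in this loop rather than
    // in the callback, so the stack does not grow with the file size.
    private void Pump(IAsyncResult ar)
    {
        byte[] digest = null;
        Exception error = null;
        try
        {
            while (true)
            {
                if (ar == null)
                {
                    ar = _stream.BeginRead(_buffer, 0, _buffer.Length, OnRead, null);
                    if (!ar.CompletedSynchronously) return; // OnRead resumes later
                }
                int read = _stream.EndRead(ar);
                ar = null;
                if (read == 0) break; // end of file
                _hash.TransformBlock(_buffer, 0, read, null, 0);
            }
            _hash.TransformFinalBlock(_buffer, 0, 0);
            digest = _hash.Hash;
        }
        catch (Exception ex)
        {
            error = ex;
        }
        Finish(digest, error);
    }

    private void OnRead(IAsyncResult ar)
    {
        // Synchronous completions are consumed by the loop in Pump.
        if (!ar.CompletedSynchronously) Pump(ar);
    }

    private void Finish(byte[] digest, Exception error)
    {
        _stream.Dispose();
        _hash.Clear();
        _onDone(digest, error);
    }
}
```

A caller would invoke `AsyncFileHasher.Start(path, callback)` and receive the digest in the callback. Note that the hash computation itself is CPU work: it runs on whichever thread delivers the read completion and occupies that thread for the duration of each `TransformBlock` call, so only the disk reads are truly asynchronous here.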

空城旧梦 2024-10-02 06:59:12

I would recommend looking into the new async programming model in F#. There's an excellent video from MS TechEd 2010 in New Orleans by Luke Hoban on this very topic:

http://www.msteched.com/2010/NorthAmerica/DEV307

http://blogs.msdn.com/b/lukeh/archive/2010/06/13/f-scaling-from-explorative-to-net-component-f-talk-teched-2010.aspx