Is it possible to use less memory when using DeflateStream?

Posted 2024-07-22 06:06:13

My application needs to decompress files which contain a lot of Deflate compressed blocks (as well as other types of compression and encryption). Memory profiling shows that the DeflateStream constructor is responsible for allocating the majority of the application's memory over its lifetime (54.19%, followed by DeflateStream.Read at 12.96%, and everything else under 2%).

To put it in perspective, each file block is usually 4KiB (decompressed) and DeflateStream's constructor allocates slightly more than 32KiB (presumably for the sliding window). The garbage collector has a field day as all these deflate streams last for almost no time at all (each goes away before the next one comes in)! Goodbye cache efficiency.

I can keep using DeflateStream, but I am wondering if there's a better alternative. Maybe a way to reset the stream and use it again?

Comments (3)

无畏 2024-07-29 06:06:13

Two comments without the benefit of any actual measurement to back this up:

  • I think you'll find that the amount of time taken by the allocations (and zeroing) of these temporary buffers is negligible next to the time spent on the actual decompression.
  • The fact that these buffers are highly transient means that although over the lifetime of the app it may be 50% of the memory, none of it exists simultaneously. Note that this also should not hurt the cache efficiency much... I'd imagine that most of these buffers will not outlive their use by much in the cache memory, because the pages would go stale very quickly.

In short, unless you have a measurable problem with the deflate stream (either in speed or absolute memory use), I'd just keep using it... better to use the solution you know than introduce another one that may have a whole different set of problems that are harder to deal with.
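If you do want a measurement, here is a minimal C# sketch of one way to get it. It assumes the built-in System.IO.Compression.DeflateStream and synthetic 4 KiB blocks; the class and method names (DeflateMeasurement, CompressBlock, DecompressBlock) and the block count are made up for illustration, and the numbers it prints will depend on your runtime and data.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

static class DeflateMeasurement
{
    // Compress one block with raw Deflate so there is something to decompress.
    public static byte[] CompressBlock(byte[] raw)
    {
        using var output = new MemoryStream();
        using (var deflate = new DeflateStream(output, CompressionMode.Compress, leaveOpen: true))
        {
            deflate.Write(raw, 0, raw.Length);
        }
        return output.ToArray();
    }

    // Decompress one block into a caller-supplied buffer, creating a fresh
    // DeflateStream each time (the pattern described in the question).
    public static int DecompressBlock(byte[] compressed, byte[] destination)
    {
        using var input = new MemoryStream(compressed);
        using var inflate = new DeflateStream(input, CompressionMode.Decompress);
        int total = 0;
        int read;
        // Read may return fewer bytes than requested, so loop until done.
        while (total < destination.Length &&
               (read = inflate.Read(destination, total, destination.Length - total)) > 0)
        {
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // Synthetic, mildly compressible 4 KiB block (an assumption, not real data).
        var raw = new byte[4096];
        for (int i = 0; i < raw.Length; i++) raw[i] = (byte)(i % 64);

        byte[] compressed = CompressBlock(raw);
        var destination = new byte[raw.Length];
        const int blocks = 100_000; // arbitrary block count for the experiment

        int gen0Before = GC.CollectionCount(0);
        var timer = Stopwatch.StartNew();
        for (int i = 0; i < blocks; i++)
        {
            DecompressBlock(compressed, destination);
        }
        timer.Stop();

        Console.WriteLine($"Elapsed: {timer.ElapsedMilliseconds} ms");
        Console.WriteLine($"Gen 0 collections: {GC.CollectionCount(0) - gen0Before}");
    }
}
```

Comparing the elapsed time against the gen 0 collection count on your real data should tell you whether the allocations are actually worth worrying about.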

海拔太高太耀眼 2024-07-29 06:06:13

Do you have any actual performance problems, or is it that you are just worried about the memory usage?

Most objects are short lived, so the memory management and the garbage collector are built to handle short lived objects efficiently. A lot of classes in the framework are designed to be used once and then thrown away, so that they are more short lived.

If you try to hang on to objects they are more likely to survive a garbage collection, which means that they will be moved from one heap generation to another. The heap generations are not just a logical division of objects; the objects are actually moved from one memory area to another. The garbage collector usually works on the principle of moving all the live objects in a heap to the next generation and then just emptying out the heap, so it's the long lived objects that are costly, not the short lived objects.

Due to this design it's quite normal for the memory throughput to be high while the actual memory usage stays low.
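To see the promotion described above for yourself, a small C# sketch like the following can help. It only uses GC.GetGeneration, GC.Collect and GC.MaxGeneration; the class and method names (GenerationDemo, ObserveGenerations) are hypothetical, and the exact generations printed depend on the runtime and GC settings, so no output is shown.

```csharp
using System;

static class GenerationDemo
{
    // Record the generation of a live object before and after each forced
    // collection. The object stays reachable through the parameter.
    public static int[] ObserveGenerations(object retained, int collections)
    {
        var seen = new int[collections + 1];
        seen[0] = GC.GetGeneration(retained);
        for (int i = 1; i <= collections; i++)
        {
            GC.Collect(); // forced only for demonstration; avoid in real code
            seen[i] = GC.GetGeneration(retained);
        }
        GC.KeepAlive(retained);
        return seen;
    }

    public static void Main()
    {
        // A small object we deliberately hang on to across collections.
        var retained = new byte[1024];
        int[] generations = ObserveGenerations(retained, 2);

        Console.WriteLine($"Highest generation on this runtime: {GC.MaxGeneration}");
        for (int i = 0; i < generations.Length; i++)
        {
            Console.WriteLine($"After {i} collection(s): generation {generations[i]}");
        }
    }
}
```

An object that is dropped before the next collection never goes through any of this, which is why short lived buffers tend to be cheap.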

忆悲凉 2024-07-29 06:06:13

There's a DeflateStream in DotNetZip, effectively a replacement of the built-in DeflateStream in the .NET BCL. Ionic.Zlib.DeflateStream has a tunable buffer size. I don't know if it will result in better memory efficiency in your scenario, but it may be worth a try. Here's the doc.

I did not test decompression, but rather compression. In my tests I found limited returns on expanding the buffer size beyond 4k, for the subset of data I compressed. On the other hand, you still get accurate, correct compression, although it is less effective, even if the buffer is 1024 bytes. I suppose you would see similar results in decompression.
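As a rough C# sketch of what trying a smaller buffer could look like: this assumes your project references DotNetZip (Ionic.Zlib.dll) and uses the BufferSize property on Ionic.Zlib.DeflateStream, set before the first read. The class and method names (IonicBufferSizeSketch, Compress, Decompress) are made up for illustration, and I have not measured whether it helps in your scenario.

```csharp
using System;
using System.IO;
using Ionic.Zlib; // DotNetZip; assumed to be referenced by the project

static class IonicBufferSizeSketch
{
    // Produce raw Deflate data to feed the decompressor.
    public static byte[] Compress(byte[] raw)
    {
        using var output = new MemoryStream();
        using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
        {
            deflate.Write(raw, 0, raw.Length);
        }
        return output.ToArray();
    }

    // Decompress with an explicit working buffer size.
    public static byte[] Decompress(byte[] compressed, int bufferSize)
    {
        using var input = new MemoryStream(compressed);
        using var inflate = new DeflateStream(input, CompressionMode.Decompress);
        inflate.BufferSize = bufferSize; // must be set before the first Read

        using var output = new MemoryStream();
        var chunk = new byte[4096];
        int read;
        while ((read = inflate.Read(chunk, 0, chunk.Length)) > 0)
        {
            output.Write(chunk, 0, read);
        }
        return output.ToArray();
    }

    public static void Main()
    {
        // Synthetic 4 KiB block (an assumption, not real data).
        var raw = new byte[4096];
        for (int i = 0; i < raw.Length; i++) raw[i] = (byte)(i % 64);

        byte[] restored = Decompress(Compress(raw), 1024);
        Console.WriteLine($"Restored {restored.Length} bytes with a 1024-byte buffer");
    }
}
```

Note that this tunes the working buffer only, not the sliding window.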

In either case, the window size is not directly settable from the public interface. But it is open source, and you will be able to easily modify the default window size as appropriate. Also, if you think it's valuable, I could take a request to expose the window size as a settable param on the DeflateStream. I haven't exposed it because no one has asked for it. Yet?

You said you had other compression, too. If you're doing Zlib or GZip, there's a ZlibStream and a GZipStream in the DotNetZip package, too.

If you want to do Zip files, you need the full DotNetZip library (Ionic.Zip.dll, at ~400k). If you are just doing {Deflate, Zlib, GZip}Stream, then there is an Ionic.Zlib.dll, which is about 90k.

DotNetZip is free, but donations are encouraged.
