Compression without a dictionary

Posted on 2025-01-28 01:36:47

I have been testing various compression algorithms with Parquet files, and have settled on Zstd.

As far as I understand, Zstd uses an adaptive dictionary unless one is explicitly specified, so it starts with an empty one. However, with a dictionary enabled, both the compressed size and the execution time are quite unsatisfactory.

The file size without a dictionary is considerably smaller than with the adaptive one (the number at the end of each name is the compression level):

  • Name: C:\ParquetFiles\Zstd1 Execution time: 279 ms Size: 13738134
  • Name: C:\ParquetFiles\Zstd2 Execution time: 140 ms Size: 13207017
  • Name: C:\ParquetFiles\Zstd9 Execution time: 511 ms Size: 12701030

For comparison, here is the log from using the adaptive dictionary:

  • Name: C:\ParquetFiles\ZstdDictZstd1 Execution time: 487 ms Size: 19462825
  • Name: C:\ParquetFiles\ZstdDictZstd2 Execution time: 402 ms Size: 19292513
  • Name: C:\ParquetFiles\ZstdDictZstd9 Execution time: 614 ms Size: 19072779

Can you help me understand these results? Shouldn't the output with an empty dictionary be at least as good as Zstd compression with the dictionary disabled?
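
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a harness that emits log lines in the same shape as the ones above. It is not the code that produced those numbers. It uses Python's standard-library `zlib` purely as a stand-in codec so that it runs without extra packages, and the names `measure`, `run_benchmark`, and `format_result`, as well as the sample input, are hypothetical. Swapping in a real Zstd or Parquet writer call for the `compress` callable is left as an assumption.

```python
import time
import zlib


def measure(name, compress, data):
    """Compress data once and report elapsed time (ms) and output size (bytes)."""
    start = time.perf_counter()
    compressed = compress(data)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return {"name": name, "ms": elapsed_ms, "size": len(compressed)}


def run_benchmark(data, levels=(1, 2, 9)):
    """Measure the stand-in codec (zlib, not Zstd) at each compression level."""
    results = []
    for level in levels:
        # Bind the current level as a default argument so each callable keeps its own.
        compress = lambda payload, lvl=level: zlib.compress(payload, lvl)
        results.append(measure(f"Zlib{level}", compress, data))
    return results


def format_result(result):
    """Render one result in the same shape as the log lines in the question."""
    return (
        f"Name: {result['name']} "
        f"Execution time: {result['ms']:.0f} ms "
        f"Size: {result['size']}"
    )


if __name__ == "__main__":
    # Hypothetical input; replace with the bytes you actually want to compress.
    sample = b"example,row,of,columnar,data\n" * 50_000
    for result in run_benchmark(sample):
        print(format_result(result))
```

A single timed run per configuration, as here, is noisy. Repeating each run several times and keeping the fastest time usually gives more comparable figures.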
