File compressor in assembly
In an effort to get better at programming assembly, and as an academic exercise, I would like to write a non-trivial program in x86 assembly. Since file compression has always been kind of an interest to me, I would like to write something like the zip utility in assembly.
I'm not exactly out of my element here, having written a simple web server using assembly and coded for embedded devices, and I've read some of the material for zlib (and others) and played with its C implementation.
My problem is finding a routine that is simple enough to port to assembly. Many of the utilities I've inspected thus far are full of #define's and other included code. Since this is really just for me to play with, I'm not really interested in super-awesome compression ratios or anything like that. I'm basically just looking for the RC4 of compression algorithms.
Is Huffman coding the path I should be looking down, or does anyone have another suggestion?
Comments (4)
And here is a more sophisticated algorithm which should not be too hard to implement: LZ77 (containing assembly examples) or LZ77 (this site contains many different compression algorithms).
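To give a feel for how little there is to LZ77 before porting it, here is a rough Python sketch. It is not taken from either linked site. The token layout (offset, length, next byte), the 255-byte window, and the names `lz77_encode` and `lz77_decode` are all assumptions made for illustration; real implementations pack the tokens into bits and search the window far more cleverly.

```python
WINDOW = 255   # assumption: offsets fit in one byte
MAX_LEN = 255  # assumption: match lengths fit in one byte


def lz77_encode(data: bytes) -> list:
    """Return a list of (offset, length, next_byte) tokens."""
    tokens = []
    i = 0
    while i < len(data):
        best_off = best_len = 0
        # Brute-force search of the window for the longest match.
        for j in range(max(0, i - WINDOW), i):
            length = 0
            # Stop one byte early so a literal is always left for the token.
            while (length < MAX_LEN
                   and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        tokens.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decode(tokens) -> bytes:
    out = bytearray()
    for offset, length, literal in tokens:
        # Copy byte by byte so a match may overlap the bytes it produces.
        for _ in range(length):
            out.append(out[-offset])
        out.append(literal)
    return bytes(out)
```

The decoder is the part that ports most directly: it is one backwards copy loop plus a store, which on x86 is a handful of instructions.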
One option would be to write a decompressor for DEFLATE (the algorithm behind zip and gzip). zlib's implementation is going to be heavily optimized, but the RFC gives pseudocode for a decoder. After you have learned the compressed format, you can move on to writing a compressor based on it.
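As a possible first step, the simplest DEFLATE block type is the "stored" (uncompressed) block described in RFC 1951: a 3-bit header (BFINAL, then BTYPE = 00), padding to the next byte boundary, a 16-bit little-endian LEN, its one's complement NLEN, and then LEN raw bytes. The sketch below handles only that case. The function name `inflate_stored` is made up, and it assumes the stream contains nothing but stored blocks, so every block header starts on a byte boundary. Huffman-coded blocks (BTYPE 01 and 10) are where the real work is and are not attempted here.

```python
def inflate_stored(stream: bytes) -> bytes:
    """Decode a raw DEFLATE stream made up only of stored blocks."""
    out = bytearray()
    pos = 0
    while True:
        if pos + 5 > len(stream):
            raise ValueError("truncated block header")
        header = stream[pos]
        bfinal = header & 1          # bit 0: set on the last block
        btype = (header >> 1) & 3    # bits 1-2: block type
        if btype != 0:
            raise NotImplementedError("only stored blocks (BTYPE=00) are handled")
        length = int.from_bytes(stream[pos + 1:pos + 3], "little")
        nlength = int.from_bytes(stream[pos + 3:pos + 5], "little")
        # NLEN must be the one's complement of LEN.
        if length ^ nlength != 0xFFFF:
            raise ValueError("LEN/NLEN mismatch")
        pos += 5
        if pos + length > len(stream):
            raise ValueError("truncated block data")
        out += stream[pos:pos + length]
        pos += length
        if bfinal:
            return bytes(out)
```

For example, the bytes `01 05 00 FA FF` followed by `hello` form one final stored block that decodes to `hello`.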
I remember a project from second year computing science that was something similar to this (in C).
Basically, compressing involves replacing a string of xxxxx (5 x's) with @\005x (the at sign, a byte with a value of 5, followed by the repeated byte). This algorithm is very simple. It doesn't work that well for English text, but works surprisingly well for bitmap images.
Edit: what I am describing is run-length encoding.
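A minimal Python sketch of that scheme, in case it helps before writing the assembly. The marker byte and the marker/count/value layout follow the description above. The rest is my own assumption: runs shorter than 4 are copied as-is, runs are capped at 255 so the count fits in one byte, and a literal @ in the input is always written in escaped form so the decoder cannot confuse it with a marker.

```python
ESCAPE = ord("@")  # marker byte from the description above
MIN_RUN = 4        # assumption: shorter runs are cheaper to copy verbatim
MAX_RUN = 255      # the count has to fit in one byte


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while run < MAX_RUN and i + run < len(data) and data[i + run] == byte:
            run += 1
        if run >= MIN_RUN or byte == ESCAPE:
            # Marker, count, value. Literal '@' bytes always take this path.
            out += bytes((ESCAPE, run, byte))
        else:
            out += bytes((byte,)) * run
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            if i + 2 >= len(data):
                raise ValueError("truncated run")
            count, byte = data[i + 1], data[i + 2]
            out += bytes((byte,)) * count
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

With these choices, `rle_encode(b"xxxxx")` produces exactly the three bytes `@`, `\x05`, `x` from the example.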
Take a look at the UPX executable packer. It contains some low-level decompression code as part of its unpacking procedures.