Are there any downsides to using UPX to compress a Windows executable?
I've used UPX before to reduce the size of my Windows executables, but I must admit that I am naive about any negative side effects this could have. What's the downside to all of this packing/unpacking?
Are there scenarios in which anyone would recommend NOT UPX-ing an executable (e.g. when writing a DLL, Windows Service, or when targeting Vista or Win7)? I write most of my code in Delphi, but I've used UPX to compress C/C++ executables as well.
On a side note, I'm not running UPX in some attempt to protect my exe from disassemblers, only to reduce the size of the executable and prevent cursory tampering.
Comments (13)
http://www.jrsoftware.org/striprlc.php#execomp
I'm surprised this hasn't been mentioned yet, but using UPX-packed executables also increases the risk of false positives from heuristic anti-virus software, because statistically a lot of malware also uses UPX.
There are three drawbacks:
Thus the above drawbacks are more of an issue if your EXE or DLLs contain lots of resources. Otherwise, they may not be much of a factor in practice, given the relative size of executables and available memory, unless you're talking about DLLs used by lots of executables (like system DLLs).
To dispel some incorrect information in other answers:
The only time size matters is during download from the Internet. If you are using UPX, you actually get worse compression than with 7-Zip (in my testing, 7-Zip did about twice as well as UPX). Then, because the executable stays compressed on the target computer, your performance is decreased (see Lars' answer). So UPX is not a good solution for file size. Just 7-Zip the whole thing.
As far as preventing tampering goes, it fails there as well. UPX supports decompression too. If someone wants to modify the EXE, they will see that it is compressed with UPX and simply decompress it. The percentage of possible crackers you might slow down does not justify the effort and the performance loss.
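To illustrate how easy it is to see that a file is UPX-packed, here is a minimal Python sketch. It assumes a stock UPX build, which names its PE sections `UPX0` and `UPX1`; a modified packer can rename them, so treat this as a heuristic only. The function names are hypothetical and chosen for this example.

```python
import struct


def pe_section_names(data: bytes) -> list[str]:
    """Return the section names found in a PE image held in memory."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_header_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_header_size
    names = []
    for i in range(num_sections):
        # Each section header is 40 bytes; the first 8 are the name.
        raw = data[table + 40 * i: table + 40 * i + 8]
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: stock UPX output has sections named UPX0, UPX1, ..."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

Typical use would be `looks_upx_packed(open(path, "rb").read())`. Once a file is identified this way, `upx -d` restores the original executable, which is why packing offers little protection against modification.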
A better solution would be to use binary signing, or at least a hash. A simple hash verification system is to take a hash of your binary together with a secret value (usually a GUID). Only your EXE knows the secret value, so when it recalculates the hash for verification it can use it again. This isn't perfect (the secret value can be retrieved). The ideal situation would be to use a certificate and a signature.
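A minimal Python sketch of the keyed-hash idea follows. It uses HMAC-SHA256 rather than a plain hash of "binary plus secret", which is a substitution on my part, and the GUID and function names are hypothetical. The expected digest has to be stored outside the bytes being hashed (for example in a separate file or an excluded region).

```python
import hashlib
import hmac
import uuid

# Hypothetical secret; in a real program this would be embedded in the EXE.
SECRET = uuid.UUID("12345678-1234-5678-1234-567812345678").bytes


def keyed_digest(payload: bytes, secret: bytes = SECRET) -> str:
    """Hash the payload together with the secret (HMAC-SHA256, hex-encoded)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_untampered(payload: bytes, expected_hex: str, secret: bytes = SECRET) -> bool:
    """Recompute the keyed digest and compare it in constant time."""
    return hmac.compare_digest(keyed_digest(payload, secret), expected_hex)
```

At build time you would record `keyed_digest(binary_bytes)`; at startup the program recomputes it and refuses to run on a mismatch. As the answer notes, anyone who extracts the secret can forge the digest, so a certificate-based signature remains the stronger option.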
The final size of the executable on disk is largely irrelevant these days. Your program may load a few milliseconds faster, but once it starts running the difference is indistinguishable.
Some people may be more suspicious of your executable just because it is compressed with UPX. Depending on your end users, this may or may not be an important consideration.
The last time I tried to use it on a managed assembly, it munged it so badly that the runtime refused to load it. That's the only case I can think of where you wouldn't want to use it (and, really, it's been so long since I tried that the situation may even be better now). I've used it extensively in the past on all types of unmanaged binaries and never had an issue.
If your only interest is in decreasing the size of the executables, have you tried comparing the size of the executable with and without runtime packages? Granted, you will also have to count the size of the packages along with your executable, but if you have multiple executables which use the same base packages, your savings would be rather high.
Another thing to look at would be the graphics/glyphs you use in your program. You can save quite a bit of space by consolidating them into a single TImageList in a global data module rather than repeating them on each form. I believe each image is stored in the form resource as hex, which would mean that each byte takes up two bytes. You can shrink this a bit by loading the image from an RCData resource using a TResourceStream.
There are no drawbacks.
But just FYI, there is a very common misconception regarding UPX: resources are NOT just being compressed.
Essentially, you are building a new executable that acts as a "loader", while the "real" executable is section-stripped, compressed, and placed as binary data inside the loader executable (regardless of what types of resources were in the original executable).
Reverse-engineering methods and tools, whether used for educational purposes or otherwise, will show you information about the "loader executable", not about the original executable.
IMHO routinely UPXing is pointless. The reasons are spelled out above; mostly, memory is more expensive than disk.
Erik: the LZMA stub might be bigger. Even if the algorithm is better, it is not always a net plus.
Virus scanners that look for 'unknown' viruses can flag UPX compressed executables as having a virus. I have been told this is because several viruses use UPX to hide themselves. I have used UPX on software and McAfee will flag the file as having a virus.
The reason UPX has so many false alarms is that its open licensing allows malware authors to use and modify it with impunity. Of course, this issue is inherent to the industry, but sadly the great UPX project is plagued by this problem.
UPDATE: Note that as the Taggant project is completed, the ability to use UPX (or anything else) without causing false positives will be enhanced, assuming UPX supports it.
I believe there is a possibility that it might not work on computers that have DEP (Data Execution Prevention) turned on.
When Windows loads a binary, the first thing it does is Import/Export Table resolution. That is, for whatever APIs and DLLs are listed in the Import Table, it first loads each DLL at a randomly generated base address. The base address plus the offset of each function within the DLL is then written into the Import Table.
An EXE typically does not have an Export Table.
All of this happens before execution jumps to the entry point.
Once execution starts at the entry point, the EXE runs a small piece of code that performs the decompression. Because this code is so small, it needs very few Windows APIs, which results in a small Import Table.
But after the binary is decompressed, if it starts to use any Windows API that was not resolved before, it is likely to crash. So it is essential that the decompression routine resolves and updates the Import Table for every Windows API referenced by the decompressed code before executing that code.
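The two-stage resolution described above can be modeled with a toy Python sketch. Everything in it is hypothetical: the base addresses, offsets, and import lists are made up for illustration, and a real stub works on in-memory PE structures rather than dictionaries.

```python
# Toy model of loaded DLLs: name -> (base address, {export name: offset}).
LOADED_MODULES = {
    "kernel32.dll": (0x7000_0000, {"LoadLibraryA": 0x1000,
                                   "GetProcAddress": 0x2000,
                                   "CreateFileW": 0x3000}),
    "user32.dll": (0x7100_0000, {"MessageBoxW": 0x1500}),
}

# Illustrative import lists: a tiny one for the stub, a larger one for the payload.
STUB_IMPORTS = [("kernel32.dll", "LoadLibraryA"), ("kernel32.dll", "GetProcAddress")]
PAYLOAD_IMPORTS = [("kernel32.dll", "CreateFileW"), ("user32.dll", "MessageBoxW")]


def resolve_imports(import_table):
    """Map each (dll, function) pair to base address + function offset."""
    resolved = {}
    for dll, func in import_table:
        base, exports = LOADED_MODULES[dll]
        resolved[(dll, func)] = base + exports[func]
    return resolved


def call_target(iat, dll, func):
    """Look up a resolved address; an unresolved import models a crash."""
    if (dll, func) not in iat:
        raise RuntimeError(f"unresolved import {dll}!{func}")
    return iat[(dll, func)]


# Stage 1: the OS loader resolves only the stub's small Import Table.
iat = resolve_imports(STUB_IMPORTS)

# Stage 2: after decompressing, the stub itself resolves the payload's imports.
iat.update(resolve_imports(PAYLOAD_IMPORTS))
```

Calling `call_target(iat, "user32.dll", "MessageBoxW")` before stage 2 raises `RuntimeError`, which stands in for the crash the answer describes; after stage 2 it returns the computed address.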
References:
https://malwaretips.com/threads/malware-analysis-2-pe-imports-static-analysis.62135/