Uses of code obfuscation in different languages
I recently learned about code obfuscation. It's a nice thing to do when you have spare time, but I have a different question: why do it?
First, there are languages where I am sure it's a great idea: interpreted ones such as PHP, JavaScript, and many more. There it seems like a good thing that adds security.
Second, there are languages where it seems to me to have no real effect: all the languages compiled to native code. Take C, for example. Once compiled, all the variable names and function names go away, and with them most obfuscation techniques. If anything survives into the native code, it would be things like recursion in place of for loops, but disassembled code will have disassembler-generated identifiers instead of names anyway, right?
The last category is languages I am not quite sure about, and that's the main reason I ask. These would be Java, C# (.NET), and Silverlight as used in WP7. I ask because I read some articles stating that, for WP7 apps, code obfuscation helps protect the code from hacking. But I always thought of bytecode as being very similar to standard assembly code, and therefore again not containing any information about the original variable names, function names, etc. So, where is the truth?
Comments (2)
Do it if you want, but don't expect a determined person to be scared away by it. De-obfuscators exist, and people can read obfuscated code as well (just as there are people who can read optimized assembly and reconstruct the original C code). Obfuscation might deter someone who is merely curious, but not those who are serious about stealing your code. All it gives you is a false sense of security, not real security. Schneier aptly calls this "security theater".
Yes, many modern languages that retain more information about the source can be obfuscated more effectively than those that are compiled straight to machine code. For the latter, the compiler already does quite a good job through optimization. Your notion of bytecode being akin to traditional assembly is slightly wrong here, though. .NET bytecode in particular retains enough metadata to reconstruct the original source almost exactly (see Reflector). What isn't retained are the names of local variables; those live in the separate debug symbols. Method and class names are still needed at runtime and are kept, as are parameter names in the .NET metadata.
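To make the "bytecode keeps names" point concrete, here is a loose analogy using CPython bytecode rather than .NET IL, since it can be inspected with nothing but the standard library. The function `compute_total` and its variables are made up for illustration. Note that CPython keeps even local variable names, which goes further than the .NET case described above, so treat this as a sketch of the general idea and not as a description of what .NET stores.

```python
import marshal


def compute_total(prices, tax_rate):
    """Hypothetical example function; the names are what we want to find."""
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)


# The code object is the compiled (bytecode) form of the function.
code = compute_total.__code__

function_name = code.co_name      # name of the function itself
local_names = code.co_varnames    # parameters first, then other locals
global_names = code.co_names      # global/builtin names the body refers to

# marshal is the serialization used inside .pyc files, so anything found
# in these bytes would ship with the compiled file.
serialized = marshal.dumps(code)
name_survives = b"compute_total" in serialized

print(function_name)
print(local_names)
print(global_names)
print(name_survives)
```

An obfuscator for a format like this works by rewriting exactly these name tables, which is why it has more to hide than one working on stripped native code.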
Another issue you should be aware of: if you give your customers an obfuscated executable and your program crashes, make sure you have a way of getting the real stack trace back instead of the obfuscated one. Saying "Sorry, I cannot determine the root cause of why my program killed hours of your work, since I chose to obfuscate it" isn't going to cut it, I guess :-)
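One common way to get the real stack trace back is to keep the name mapping the obfuscator produced and apply it in reverse to incoming crash reports. The sketch below assumes a made-up mapping and a made-up trace format; `restore_trace`, the class names, and the `(Unknown Source)` frames are all hypothetical, and real tools ship their own mapping formats and retrace utilities.

```python
import re


def restore_trace(trace: str, mapping: dict) -> str:
    """Replace obfuscated dotted names in a trace with their original names."""

    def swap(match):
        token = match.group(0)
        # Leave tokens we have no mapping for (e.g. "at", "Unknown") untouched.
        return mapping.get(token, token)

    # A token is an identifier optionally followed by dotted parts, e.g. "a.b".
    return re.sub(r"[A-Za-z_][\w.]*", swap, trace)


# Hypothetical mapping saved at build time: obfuscated name -> original name.
mapping = {
    "a.b": "OrderService.saveOrder",
    "c.a": "Database.commit",
}

obfuscated = (
    "Exception in worker\n"
    "  at c.a(Unknown Source)\n"
    "  at a.b(Unknown Source)"
)

restored = restore_trace(obfuscated, mapping)
print(restored)
```

The important part is operational rather than technical: the mapping has to be archived per release, because a mapping from a different build will silently produce wrong names.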
Obfuscation is a common technique for mobile applications where you have hardware restrictions. Obfuscated code tends to have shorter identifiers and therefore smaller binaries.
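The size effect is easy to see on a toy scale. The sketch below renames a hand-picked list of identifiers in a small piece of Python source and compares lengths; it works on source text, not on a compiled mobile binary, and `shorten`, `short_names`, and the sample function are invented for the illustration. A real renamer would parse the code and avoid keywords and name clashes, which this one does not attempt.

```python
import itertools
import re
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... (no keyword or clash checks)."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def shorten(source, identifiers):
    """Rename the given identifiers to short names using whole-word matches."""
    names = short_names()
    renames = {name: next(names) for name in identifiers}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)], source), renames


source = (
    "def calculate_invoice_total(invoice_lines):\n"
    "    running_total = 0\n"
    "    for invoice_line in invoice_lines:\n"
    "        running_total += invoice_line\n"
    "    return running_total\n"
)
identifiers = [
    "calculate_invoice_total",
    "invoice_lines",
    "running_total",
    "invoice_line",
]

shrunk, renames = shorten(source, identifiers)

# Check the renamed code still behaves the same.
namespace = {}
exec(shrunk, namespace)
result = namespace[renames["calculate_invoice_total"]]([1, 2, 3])

print(len(source), len(shrunk), result)
```

In formats that store identifiers as strings in the compiled file, the same saving applies to every class, method, and field name, which is where the smaller binaries come from.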