How effective is obfuscation?

Posted on 2024-07-13 21:03:48

A different question, i.e. Best .NET obfuscation tools/strategy, asks whether obfuscation is easy to implement using tools.

My question though is, is obfuscation effective? In a comment replying to this answer, someone said that "if you're worried about source theft ... obfuscation is almost trivial to a real cracker".

I've looked at the output from the Community Edition of Dotfuscator: and it looks obfuscated to me! I wouldn't want to maintain that!

I understand that simply 'cracking' obfuscated software might be relatively easy: because you only need to find whichever location in the software implements whatever it is you want to crack (typically the license protection), and add a jump to skip that.

If the worry is more than just cracking by an end-user or a 'pirate' though: if the worry is "source theft" i.e. if you're a software vendor, and your worry is another vendor (a potential competitor) reverse-engineering your source, which they could then use in or add to their own product ... to what extent is simple obfuscation an adequate or inadequate protection against that risk?

1st edit:

The code in question is about 20 KLOC which runs on end-user machines (a user control, not a remote service).

If obfuscation really is "almost trivial to a real cracker", I'd like some insight into why it's ineffective (and not just "how much" it's not effective).

2nd edit:

I'm not worried about someone's reversing the algorithm: more worried about their repurposing the actual implementation of the algorithm (i.e. the source code) into their own product.

Figuring that 20 KLOC is several months' work to develop, would it take more or less than this (several months) to deobfuscate it all?

Is it even necessary to deobfuscate something in order to 'steal' it: or might a sane competitor simply incorporate it wholesale into their product while still obfuscated, accept that as-is it's a maintenance nightmare, and hope that it needs little maintenance? If this scenario is a possibility, then is obfuscated .NET code any more vulnerable to this than compiled machine code is?

Is most of the obfuscation "arms race" aimed at preventing people from even 'cracking' something (e.g. finding and deleting the code fragment which implements licensing protection/enforcement), more than at preventing 'source theft'?

终陌 2024-07-20 21:03:48

I've discussed why I don't think Obfuscation is an effective means of protection against cracking here:
Protect .NET Code from reverse engineering

However, your question is specifically about source theft, which is an interesting topic. In Eldad Eilam's book, "Reversing: Secrets of Reverse Engineering", the author discusses source theft as one reason behind reverse engineering in the first two chapters.

Basically, what it comes down to is the only chance you have of being targeted for source theft is if you have some very specific, hard to engineer, algorithm related to your domain that gives you a leg up on your competition. This is just about the only time it would be cost-effective to attempt to reverse engineer a small portion of your application.

So, unless you have some top-secret algorithm you don't want your competition to have, you don't need to worry about source theft. The cost involved with reversing any significant amount of source-code out of your application quickly exceeds the cost of re-writing it from scratch.

Even if you do have some algorithm you don't want them to have, there isn't much you can do to stop determined and skilled individuals from getting it anyway (if the application is executing on their machine).

Some common anti-reversing measures are:

Obfuscating - Doesn't do much in terms of protecting your source or preventing it from being cracked. But we might as well not make it totally easy, right?
3rd Party Packers - Themida is one of the better ones. Packs an executable into an encrypted win32 application. Prevents reflection if the application is a .NET app as well.
Custom Packers - Writing your own packer, if you have the skill to do so, is sometimes effective because there is very little information in the cracking scene about how to unpack your application. This can stop inexperienced reverse engineers. This tutorial gives some good information on writing your own packer.
Keep industry-secret algorithms off the user's machine. Execute them as a remote service so the instructions are never executed locally. The only "fool-proof" method of protection.
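The last measure can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a production design: `secret_score`, `Handler`, `recommend` and `demo` are hypothetical names, the "algorithm" is a trivial stand-in, and a real deployment would add authentication, TLS and rate limiting. The point is only that the client ships a thin wrapper while the valuable logic never leaves the server.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def secret_score(plays):
    """Stand-in for the proprietary algorithm; it exists only on the server."""
    return sorted(plays, key=plays.get, reverse=True)[:2]


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, run the algorithm, return JSON.
        length = int(self.headers.get("Content-Length", 0))
        plays = json.loads(self.rfile.read(length))
        body = json.dumps(secret_score(plays)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def recommend(url, plays):
    """All the shipped client contains: a network call, no algorithm."""
    request = urllib.request.Request(
        url,
        data=json.dumps(plays).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Ignore any proxy settings so the local demo talks to the server directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


def demo():
    # Port 0 asks the OS for any free port; the server runs in a background thread.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return recommend(url, {"jazz": 5, "rock": 9, "folk": 1})
    finally:
        server.shutdown()
        server.server_close()
```

Someone reversing the client here would find only `recommend`; they could still observe inputs and outputs, but not the instructions that produced them.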

However, packers can be unpacked, and obfuscation doesn't really hinder those who want to see what your application is doing. If the program is run on the user's machine then it is vulnerable.

Eventually its code must be executed as machine code, and it is normally a matter of firing up a debugger, setting a few breakpoints, monitoring the instructions being executed during the relevant action, and spending some time poring over this data.
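As a rough analogy for that workflow, here is a small Python sketch that uses `sys.settrace` as a stand-in for a debugger. The functions `_q` and `_r` are hypothetical "obfuscated" code, and `trace_calls` is an illustrative helper, not a real reversing tool. It shows that meaningless names do not hide which functions run or what values flow through them.

```python
import sys


def _q(a, b):
    # "Obfuscated" entry point: the names say nothing about intent.
    return _r(a) + b


def _r(x):
    return x * 3


def trace_calls(func, *args):
    """Run func under a tracer and record every Python-level call it makes."""
    events = []

    def tracer(frame, event, arg):
        if event == "call":
            # At call time the frame's locals are exactly the arguments.
            events.append((frame.f_code.co_name, dict(frame.f_locals)))
        return None  # no per-line tracing needed for this sketch

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, events


result, events = trace_calls(_q, 2, 5)
```

After this runs, `events` holds the call sequence with arguments, which is the kind of data an analyst pores over regardless of how the identifiers were renamed.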

You mentioned that it took you several months to write ~20kLOC for your application. It would take almost an order of magnitude longer to reverse those equivalent 20kLOC from your application into workable source if you took the bare minimum precautions.

This is why it is only cost-effective to reverse small, industry-specific algorithms from your application. Anything else and it isn't worth it.

Take the following fictionalized example: Let's say you just developed a brand new competing application for iTunes that had a ton of bells and whistles. Let's say it took several 100k LOC and 2 years to develop. One key feature you have is a new way of serving up music to the user based on their music-listening taste.

Apple (being the pirates they are) gets wind of this and decides they really like your music suggest feature, so they decide to reverse it. They will then home in on only that algorithm, and the reverse engineers will eventually come up with a workable algorithm that serves up the equivalent suggestions given the same data. Then they implement said algorithm in their own application, call it "Genius" and make their next 10 trillion dollars.

That is how source theft goes down.

No one would sit there and reverse all 100k LOC to steal significant chunks of your compiled application. It would simply be too costly and too time-consuming. About 90% of the time they would be reversing boring, non-industry-secret code that simply handled button presses or user input. Instead, they could hire developers of their own to rewrite most of it from scratch for less money and simply reverse the important algorithms that are difficult to engineer and that give you an edge (i.e., the music suggest feature).


好倦 2024-07-20 21:03:48

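Obfuscation is a form of security through obscurity, and while it provides some protection, that protection is obviously quite limited.

For the purposes you describe, obscurity can certainly help, and in many cases it is adequate protection against the risk of code theft. However, given sufficient time and effort, there is certainly still a risk of the code being "unobfuscated". Unobfuscating an entire codebase would be effectively impossible, but the risk is higher if an interested party only wants to determine how you implemented some specific part.

In the end, only you can determine whether the risk is worth it to you or your business. However, in many cases this is the only option you have if you want to sell your product to customers to use in their own environments.

Regarding "why it's ineffective": the reason is that a cracker can use a debugger to see where your code is running, regardless of what obfuscation technique is used. They can then use this to work around any protection mechanisms you have put in place, such as a serial number or "phone home" system.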
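A toy Python sketch of that kind of workaround is below. `App`, `_x9` and `crack` are hypothetical, and replacing a method at runtime stands in for patching a jump in a compiled binary. Renaming the check to `_x9` does not help, because the attacker only has to find the one branch that gates startup and neutralize it.

```python
class App:
    def _x9(self, k):
        # "Obfuscated" license check: opaque name, but still a single yes/no gate.
        return sum(ord(c) for c in k) % 97 == 13

    def start(self, key):
        if not self._x9(key):
            return "license required"
        return "running"


def crack(app_cls):
    """Replace the check with one that always passes (the 'added jump')."""
    app_cls._x9 = lambda self, k: True


before = App().start("guess")   # the unmodified check rejects this key
crack(App)
after = App().start("guess")    # the same key is now accepted
```

Nothing about the rest of the application had to be understood to do this, which is why cracking is so much cheaper than recovering usable source.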

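I don't believe the comment was really referring to "code theft" in the sense that your code would be stolen and used in another project. Because they used the word "cracker", I believe they were talking about "theft" in terms of software piracy. Crackers specialize in working around protection mechanisms; they aren't interested in using your source code for some other purpose.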


嘴硬脾气大 2024-07-20 21:03:48

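Most people tend to write what appears to be obfuscated code anyway, and that hasn't stopped the crackers, so what's the difference?

EDIT:

OK, serious time. If you really want to make something that's hard to break, look into polymorphic coding (not to be confused with polymorphism). Write code that mutates itself; it is a serious pain to break and will keep them guessing.

http://en.wikipedia.org/wiki/Polymorphic_code

In the end, nothing is impossible to reverse engineer.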


第七度阳光i 2024-07-20 21:03:48

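You're worried about people stealing the specific algorithms used in your product. Either you are Fair Isaac, or you need to differentiate yourself using more than the way you x++;. If you have solved some problem in code that someone else couldn't solve by puzzling at it for a few hours, you should have a PhD in computer science and/or patents to protect your invention. 99% of software products are not successful or special because of their algorithms. They are successful because their authors did the heavy lifting of putting well-known and easily understood concepts together into a product that does what their customers need, and sell it for less than it would cost to pay someone else to redo the same work.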


你的呼吸 2024-07-20 21:03:48

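Look at it this way: the WMD editor that you typed your question into was reverse-engineered by the SO team in order to fix some bugs and make some enhancements. That code was obfuscated. You're never going to stop smart people from hacking your code; the best you can hope for is to keep honest people honest and to make it hard to break.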


沧桑㈠ 2024-07-20 21:03:48

If you've ever seen the output of a disassembler, you'll realise why obfuscation will always fail.
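A small sketch of what that looks like in practice, using Python's `dis` module as a stand-in for a .NET IL or native disassembler (an assumption made for illustration; both functions below are hypothetical). Renaming identifiers, the most basic obfuscation, leaves the instruction stream untouched, so the disassembly of the "obfuscated" version has exactly the same shape as the readable one.

```python
import dis


def compute_discount(price, loyalty_years):
    # Readable original.
    rate = 0.05 * loyalty_years
    if rate > 0.25:
        rate = 0.25
    return price * (1 - rate)


def a(b, c):
    # Same logic after a rename-only "obfuscation" pass.
    d = 0.05 * c
    if d > 0.25:
        d = 0.25
    return b * (1 - d)


def opnames(func):
    """Return the sequence of bytecode operation names for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


# True when both functions compile to the same sequence of operations.
same_shape = opnames(compute_discount) == opnames(a)
```

Calling `dis.dis(a)` prints the full listing; the constants 0.05 and 0.25, the comparison and the multiplication are all plainly visible even though every name is meaningless. Commercial obfuscators go further than renaming (control-flow changes, string encryption), but the code must still execute, so the operations remain recoverable with more effort.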


剪不断理还乱 2024-07-20 21:03:48

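I tend to think that obfuscation is really not very effective if you want to protect your source. A real expert in the field (I don't mean a software expert or a cracker here, I mean an expert in the domain the code serves) usually doesn't need to see the code at all. He or she only needs to see how it reacts to special inputs, edge cases and so on to work out how to implement a copy, or code equivalent to the protected functionality. So it doesn't help much in protecting your know-how.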
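A deliberately simple sketch of that black-box approach follows. `protected_feature` is a hypothetical stand-in for the shipped functionality, and the sketch assumes the analyst already suspects the behaviour is affine in its two inputs; that suspicion is exactly the domain knowledge the answer refers to, and real features are rarely this easy to pin down. Three probes are then enough to rebuild an equivalent function without ever looking at the code.

```python
def protected_feature(x, y):
    # Pretend this ships obfuscated; the analyst never reads it.
    return 3 * x + 2 * y + 7


def probe_affine(f):
    """Recover f from three probes, assuming f(x, y) = wx*x + wy*y + bias."""
    bias = f(0, 0)
    wx = f(1, 0) - bias
    wy = f(0, 1) - bias

    def clone(x, y):
        return wx * x + wy * y + bias

    return clone


clone = probe_affine(protected_feature)

# Check the clone against the original over a grid of inputs.
matches = all(
    clone(x, y) == protected_feature(x, y)
    for x in range(-3, 4)
    for y in range(-3, 4)
)
```

Obfuscation has no bearing on this kind of reconstruction, since only the observable behaviour is used.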


泅渡 2024-07-20 21:03:48

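If you have IP in your code that must be protected at all costs, then you should make your software's functionality available as a service on a secured remote server.

Good obfuscation will protect you up to a point, but it all comes down to the effort required to break it versus the "reward" of having the code. If you're talking about stopping the average business user, then a commercial obfuscator should be sufficient.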


南…巷孤猫 2024-07-20 21:03:48

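The short answer is yes and no; it depends entirely on what you're trying to prevent. The Secure Programming Cookbook has some interesting comments on this on page 653 (which, conveniently, isn't available in the Google Books preview). It classifies anti-tampering into four categories: zero-day (slowing the attacker down so it takes them a long time to accomplish what they want), protecting a proprietary algorithm against reverse engineering, "because I can" attacks, and a fourth that I can't remember. You have to ask what you are trying to prevent, and if you're really concerned about people looking at your source code, then obfuscation has some value. Used on its own, it is usually just an annoyance to someone trying to mess with your application; like any good security measure, it works best in combination with other anti-tampering techniques.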
