How do I embed a string in executable code?
I have a string literal that's used in a number of different places around my executable.
Let's say something like:
#include <stdio.h>

const char *formatString = "Something I don't want to make obvious: %d";
int format1(char *buf) { return sprintf(buf, formatString, 1); }
int format2(char *buf) { return sprintf(buf, formatString, 2); }
//...
Now, this string literal becomes very obvious inside the executable code, because it's embedded literally.
Is there any way to avoid this by forcing the compiler to, for example, generate assembly instructions (e.g. mov [ptr + 4], 0x65) that build the string at run time, instead of embedding the string literally?
I don't want to do obfuscation of any sort; I simply want to avoid making the string obvious inside the executable. (I also don't want to have to modify my code in every single place the string is used.)
Is this possible?
Comments (3)
To avoid pasting encrypted strings into the code by hand, you can create a macro that marks the strings needing obfuscation and a function that decrypts them.
The function must do two things:
If the parameter starts with START_MARK_GUID, simply return the original string (without the GUIDs). This lets you use the unobfuscated executable too, e.g. when debugging.
If it starts with ENCRYPTED_MARK_GUID, deobfuscate it first and then return a new string. In C you will have to take care of memory lifetime here; in C++ you can simply return a std::string.
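The macro and function described here might look roughly like the following C++ sketch. The answer's own code is not shown on this page, so everything below is an assumption made for illustration: the three marker values, the END_MARK_GUID terminator, the HIDDEN macro, the reveal function, and the single-byte XOR key. The key has its high bit set so that XOR-ing 7-bit ASCII text never produces a NUL byte, which keeps the encrypted data usable as a C string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical markers; any equal-length byte strings unlikely to occur elsewhere would do.
#define START_MARK_GUID     "{6F1C2A90-START}"
#define ENCRYPTED_MARK_GUID "{6F1C2A90-CRYPT}"
#define END_MARK_GUID       "{6F1C2A90-END**}"

static_assert(sizeof(START_MARK_GUID) == sizeof(ENCRYPTED_MARK_GUID),
              "markers must have equal length so a binary can be patched in place");

// Hypothetical XOR key (assumes the hidden text is 7-bit ASCII).
const unsigned char kKey = 0xA5;

// Wraps a literal in markers; the compiler concatenates adjacent string literals.
#define HIDDEN(s) reveal(START_MARK_GUID s END_MARK_GUID)

std::string reveal(const char* marked) {
    assert(marked != nullptr);
    const std::size_t markLen = sizeof(START_MARK_GUID) - 1;
    const bool plain = std::strncmp(marked, START_MARK_GUID, markLen) == 0;
    const bool encrypted = std::strncmp(marked, ENCRYPTED_MARK_GUID, markLen) == 0;
    if (!plain && !encrypted) return marked;  // not a marked string: pass through

    // The payload sits between the leading marker and END_MARK_GUID.
    const char* body = marked + markLen;
    const char* end = std::strstr(body, END_MARK_GUID);
    std::string out = end ? std::string(body, end) : std::string(body);
    if (encrypted) {
        for (char& c : out) {
            c = static_cast<char>(static_cast<unsigned char>(c) ^ kKey);
        }
    }
    return out;
}
```

A call site would then read something like sprintf(buf, HIDDEN("Something: %d").c_str(), 1), and it behaves the same whether or not the binary has been post-processed.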
Finally, create an obfuscator program that looks for the GUIDs in a compiled binary and encrypts the data between them. It is only a few lines in Python or a similar language. I also recommend fixing up the EXE's CRC afterwards, though my program worked even without that.
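The answer suggests Python for this step; the sketch below uses C++ only to stay in the same language as the program being patched. It shows the core pass over an in-memory copy of the binary and leaves out reading and writing the file and any checksum fix-up. The marker values, the end marker, the XOR key, and the obfuscateImage name are all assumptions for illustration, and they must match whatever the program's decrypting function expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical markers and key; they must match the values compiled into the program.
static const std::string kStart     = "{6F1C2A90-START}";
static const std::string kEncrypted = "{6F1C2A90-CRYPT}";
static const std::string kEnd       = "{6F1C2A90-END**}";
static const unsigned char kKey     = 0xA5;

// XORs every payload found between kStart and kEnd and swaps kStart for
// kEncrypted. Returns the number of strings patched. The size of the image
// never changes, so offsets inside the binary stay valid.
std::size_t obfuscateImage(std::vector<unsigned char>& image) {
    assert(kStart.size() == kEncrypted.size());
    std::size_t patched = 0;
    auto pos = image.begin();
    while (true) {
        pos = std::search(pos, image.end(), kStart.begin(), kStart.end());
        if (pos == image.end()) break;
        auto body = pos + kStart.size();
        // A marked string is one C string, so the end marker must appear
        // before the next NUL byte. A bare marker literal (for example the
        // one the decrypting function compares against) fails this check.
        auto nul = std::find(body, image.end(), 0);
        auto end = std::search(body, nul, kEnd.begin(), kEnd.end());
        if (end == nul) {
            pos = body;  // not a marked string; leave it untouched
            continue;
        }
        std::copy(kEncrypted.begin(), kEncrypted.end(), pos);
        for (auto it = body; it != end; ++it) *it ^= kKey;
        ++patched;
        pos = end + kEnd.size();
    }
    return patched;
}
```

Skipping markers that are not followed by an end marker within the same C string is a design choice of this sketch: without it, the pass could start encrypting at a marker constant that the program itself uses for comparison.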
You can replace the GUIDs with shorter, less unique identifiers to save some space. You can also improve this so that decryption happens only once (e.g. write a string ID along with ENCRYPTED_MARK_GUID and keep the decrypted strings in a dictionary keyed by that ID).
Obfuscation is probably your best bet. Use a simple obfuscation (such as XOR), and unobfuscate it into another variable at the beginning of your program before running any code that needs the string.
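A minimal sketch of that idea in C++ follows. It assumes a single-byte XOR key and a short stand-in format string, "id=%d", whose encoded bytes were worked out by hand; kKey, kHiddenFormat, xorBytes and formatStorage are hypothetical names. The string is decoded once during static initialization, so functions that use formatString do not need to change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical key; the high bit is set so ASCII text never encodes to NUL.
const unsigned char kKey = 0xA5;

// "id=%d" XOR 0xA5, computed offline. Only these bytes are stored in the binary.
const unsigned char kHiddenFormat[] = {0xCC, 0xC1, 0x98, 0x80, 0xC1};

// XOR is its own inverse, so the same routine encodes and decodes.
std::string xorBytes(const unsigned char* data, std::size_t n, unsigned char key) {
    assert(data != nullptr);
    std::string out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out += static_cast<char>(data[i] ^ key);
    }
    return out;
}

// Decoded once, before main() runs; objects in one translation unit are
// initialized in the order in which they are defined.
const std::string formatStorage = xorBytes(kHiddenFormat, sizeof kHiddenFormat, kKey);
const char* formatString = formatStorage.c_str();

// Call sites keep using formatString exactly as before.
int format1(char* buf) { return std::sprintf(buf, formatString, 1); }
int format2(char* buf) { return std::sprintf(buf, formatString, 2); }
```

An optimizer is free to precompute values, so it is worth checking the built binary (for example with a strings-style tool) to confirm the plain text is really absent. Code in other translation units that runs during static initialization should not rely on formatString being ready yet.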
Pretty soon with C++11 you'll be able to use user-defined literals.
gcc is working on this furiously and I think IBM has this already. Not sure about the status of Visual Studio. This compiled on a patched gcc.
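The code this answer refers to is not shown on this page, so it is not reconstructed here. As a related illustration only, the sketch below uses a different C++11 feature, constexpr functions with a parameter pack, rather than user-defined literals, to XOR a string literal at compile time so that only the encoded bytes need to be stored. Every name in it (Indices, MakeIndices, Encoded, encode, kKey, kHidden) is hypothetical, and the literal must be non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Minimal C++11 index sequence: MakeIndices<3>::type is Indices<0, 1, 2>.
template <std::size_t... Is> struct Indices {};
template <std::size_t N, std::size_t... Is>
struct MakeIndices : MakeIndices<N - 1, N - 1, Is...> {};
template <std::size_t... Is>
struct MakeIndices<0, Is...> { typedef Indices<Is...> type; };

// Hypothetical single-byte XOR key.
constexpr unsigned char kKey = 0xA5;

// Holds the encoded bytes of a string of length N (no terminator).
template <std::size_t N>
struct Encoded {
    unsigned char bytes[N];

    // Decodes at run time into a fresh std::string.
    std::string decode() const {
        assert(N > 0);
        std::string out;
        for (std::size_t i = 0; i < N; ++i) {
            out += static_cast<char>(bytes[i] ^ kKey);
        }
        return out;
    }
};

// Expands the pack so each character is XOR-ed inside a constant expression.
template <std::size_t N, std::size_t... Is>
constexpr Encoded<sizeof...(Is)> encodeImpl(const char (&s)[N], Indices<Is...>) {
    return Encoded<sizeof...(Is)>{{static_cast<unsigned char>(s[Is] ^ kKey)...}};
}

// Encodes a string literal, dropping its trailing NUL.
template <std::size_t N>
constexpr Encoded<N - 1> encode(const char (&s)[N]) {
    return encodeImpl(s, typename MakeIndices<N - 1>::type());
}

// constexpr forces the encoding to happen during compilation.
constexpr auto kHidden = encode("id=%d");
```

A caller would write kHidden.decode() wherever the plain string is needed. Compilers typically do not emit a literal that is only used during constant evaluation, but that is not guaranteed, so inspecting the resulting binary is still advisable.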