My idea is to "lift" a 64-bit Windows executable to LLVM bitcode (or anything higher-level than assembly) and then compile it back to a 32-bit executable.
I found that RetDec and McSema can lift a PE binary to LLVM IR (and optionally C), but McSema requires IDA Pro, so I haven't tried it yet.
I have installed MSVC v143 and Windows SDK version 10.0.19041.0.
Clang version:
clang version 13.0.1 (https://github.com/llvm/llvm-project 75e33f71c2dae584b13a7d1186ae0a038ba98838)
Target: x86_64-pc-windows-msvc
Thread model: posix
So I compiled this Hello World program in C using Clang:
#include <stdio.h>
int main()
{
printf("Hello, world!\n");
}
with:
clang hello.c -o hello.exe
Checking the file type of hello.exe with WSL:
$ file hello.exe
hello.exe: PE32+ executable (console) x86-64, for MS Windows
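The same check can be sketched in C by reading the PE headers directly. This is only an illustration and not part of the workflow above: the helper name `pe_machine` is hypothetical, and it assumes the start of the file has already been read into a buffer. The offsets and Machine values are those of the PE/COFF format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Machine values from the PE/COFF format. */
#define PE_MACHINE_I386  0x014c  /* 32-bit x86 */
#define PE_MACHINE_AMD64 0x8664  /* x86-64     */

/* Hypothetical helper: returns the COFF Machine field of a PE image held in
 * buf, or 0 if the buffer does not look like a PE file. */
uint16_t pe_machine(const unsigned char *buf, size_t len)
{
    uint32_t pe_off;

    assert(buf != NULL);

    /* DOS header: "MZ" magic; e_lfanew (offset of the PE header) is at 0x3C. */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return 0;
    pe_off = (uint32_t)buf[0x3C] | ((uint32_t)buf[0x3D] << 8) |
             ((uint32_t)buf[0x3E] << 16) | ((uint32_t)buf[0x3F] << 24);

    /* Need the 4-byte "PE\0\0" signature plus the 2-byte Machine field. */
    if (pe_off > len || len - pe_off < 6)
        return 0;
    if (memcmp(buf + pe_off, "PE\0\0", 4) != 0)
        return 0;

    /* Machine is the first field of the COFF header, little-endian. */
    return (uint16_t)(buf[pe_off + 4] | (buf[pe_off + 5] << 8));
}
```

For the hello.exe above, this should return `PE_MACHINE_AMD64`; a successfully rebuilt 32-bit executable should give `PE_MACHINE_I386`.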
You can download it here.
Then I used RetDec to lift it to LLVM IR:
python retdec-decompiler.py --no-memory-limit hello.exe
Output: here
After that we get the following files:
[Image: files]
Then I compiled the bitcode back to an executable:
clang hello.exe.bc -m32 -v -Wl,/SUBSYSTEM:CONSOLE -Wl,/errorlimit:0 -fuse-ld=lld -o hello.x86.exe
Output: here
I guess functions like _WriteConsoleW are Win32 APIs, but ___decompiler_undefined_function_0 was probably generated by the decompiler somehow.
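One detail that may be relevant to the Win32 symbols, offered as an assumption because the linker output is not reproduced here: on 32-bit x86, Win32 API functions use the `__stdcall` convention, and MSVC-style toolchains decorate such names as `_Name@N`, where N is the number of bytes of arguments. A plain C (cdecl) reference is decorated as just `_Name`. WriteConsoleW takes five arguments that are each 4 bytes wide on x86, so its decorated 32-bit name is `_WriteConsoleW@20`, and a reference to `_WriteConsoleW` would not match it. The helper below is a hypothetical sketch that only builds the decorated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the 32-bit x86 __stdcall decorated form
 * "_Name@N" into out, where N is the total size of the arguments in bytes.
 * Returns the value returned by snprintf. */
int stdcall_decorated_name(char *out, size_t out_len,
                           const char *name, unsigned arg_bytes)
{
    assert(out != NULL && name != NULL);
    return snprintf(out, out_len, "_%s@%u", name, arg_bytes);
}
```

Calling `stdcall_decorated_name(buf, sizeof buf, "WriteConsoleW", 20)` fills `buf` with `_WriteConsoleW@20`.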
Also, the decompiled code has no main function, but it has an entry_point function. From hello.exe.ll:
[Image: hello.exe.ll]
hello.exe.c also has entry_point instead of main:
[Image: hello.exe.c]
Also, hello.exe.c does not contain ___decompiler_undefined_function_0.
I also tried running the bitcode with lli:
lli --entry-function=entry_point hello.exe.bc
Output: here
Here is the link to the files.
How can I make this compile? Thanks!
Comments (1)
That's very ambitious.
I'm going to go out on a limb and say that every Windows application includes thousands of system header files, most of which use types whose size differs between 32- and 64-bit systems, and many of which contain #ifdef or other platform-dependent differences. You'll have a large .ll file full of types and code specific to 64-bit Windows. If the developers at Microsoft saw 64-bit Windows as a good chance to drop some hacks that were needed for Windows 95 code, then you'll have Win32-incompatible code there, too.
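To make the #ifdef point concrete, here is a minimal sketch. It is a made-up example, not taken from any Windows header: real headers test macros such as `_WIN64`, while this one tests `UINTPTR_MAX` only so that it also compiles outside Windows. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical platform-dependent branch, chosen by the preprocessor. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define COMPILED_POINTER_BITS 64
#else
#define COMPILED_POINTER_BITS 32
#endif

/* After preprocessing only one branch survives, so the compiled code, and any
 * IR lifted from it, contains just this one constant. The other branch is
 * not in the binary at all. */
int compiled_pointer_bits(void)
{
    /* Sanity check; expected to hold on common 32- and 64-bit targets. */
    assert(sizeof(void *) * 8 == COMPILED_POINTER_BITS);
    return COMPILED_POINTER_BITS;
}
```

A lifter working on the 64-bit binary sees only the constant 64. Nothing tells it that a 32-bit build of the same source would have taken the other branch.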
What you have to do is what the Wine developers did: add code to cater to each problem in turn. There will be thousands of cases to handle, and some of them will be very difficult. When you see the number 128 in the .ll file, was it sizeof(this_w64_struct) in the original source, sizeof(that_other_struct), or something else entirely? Should you change the number, and if so, to what?
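A small sketch of why such constants are hard to port. It computes the size of a made-up struct, `{ int32_t id; void *next; int32_t flags; }`, for a given pointer width, assuming natural alignment (each member aligned to its own size, the struct padded to its largest member). The struct and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align is a power of two). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Size of the hypothetical struct { int32_t id; void *next; int32_t flags; }
 * on a target whose pointers are ptr_bytes wide, under natural alignment. */
size_t example_struct_size(size_t ptr_bytes)
{
    size_t offset = 0;
    size_t max_align;

    assert(ptr_bytes == 4 || ptr_bytes == 8);
    max_align = ptr_bytes > 4 ? ptr_bytes : 4;

    offset = align_up(offset, 4) + 4;                 /* id    */
    offset = align_up(offset, ptr_bytes) + ptr_bytes; /* next  */
    offset = align_up(offset, 4) + 4;                 /* flags */

    return align_up(offset, max_align);               /* tail padding */
}
```

`example_struct_size(8)` is 24 and `example_struct_size(4)` is 12. IR lifted from the 64-bit binary would contain only the literal 24, with no record that it came from a sizeof and should become 12 in a 32-bit build.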
You should expect this project to take at least years, maybe a decade or more. Good luck.