LLVM JIT and native
I don't understand how the LLVM JIT relates to normal non-JIT compilation, and the documentation isn't good.
For example, suppose I use the clang front end:
- Case 1: I compile a C file to native code with clang/llvm. As I understand it, this flow is like the gcc flow: I get my x86 executable and that runs.
- Case 2: I compile into some kind of LLVM IR that runs on the LLVM JIT. In this case, does the executable contain the LLVM runtime to execute the IR on the JIT, or how does it work?
What is the difference between these two, and are they correct? Does the LLVM flow include support for both JIT and non-JIT compilation? When would I want to use the JIT, and does it make sense at all for a language like C?
4 Answers
You have to understand that LLVM is a library that helps you build compilers. Clang is merely a frontend for this library.
Clang translates C/C++ code into LLVM IR and hands it over to LLVM, which compiles it into native code.
LLVM is also able to generate native code directly in memory, which can then be called as a normal function. So cases 1 and 2 share LLVM's optimization and code generation.
So how does one use LLVM as a JIT compiler? You build an application which generates some LLVM IR (in memory), then use the LLVM library to generate native code (still in memory). LLVM hands you back a pointer which you can call afterwards. No clang is involved.
You can, however, use clang to translate some C code into LLVM IR and load this into your JIT context to use the functions.
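As a rough illustration of that in-memory flow, here is a sketch using LLVM's C API with the MCJIT execution engine; the module name, the add function, and the build command in the comment are assumptions for this example, and the exact API details vary between LLVM releases:

```c
/* Build roughly with:
 *   clang jit_demo.c $(llvm-config --cflags --ldflags \
 *       --libs core executionengine mcjit native --system-libs) -o jit_demo
 * (component names and extra link flags can differ per LLVM release) */
#include <stdio.h>
#include <stdint.h>
#include <llvm-c/Core.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/Target.h>

int main(void) {
    /* Generate LLVM IR in memory: i32 add(i32 a, i32 b) { return a + b; } */
    LLVMModuleRef module = LLVMModuleCreateWithName("jit_demo");
    LLVMTypeRef params[] = { LLVMInt32Type(), LLVMInt32Type() };
    LLVMTypeRef fn_type = LLVMFunctionType(LLVMInt32Type(), params, 2, 0);
    LLVMValueRef add_fn = LLVMAddFunction(module, "add", fn_type);

    LLVMBuilderRef builder = LLVMCreateBuilder();
    LLVMPositionBuilderAtEnd(builder, LLVMAppendBasicBlock(add_fn, "entry"));
    LLVMValueRef sum = LLVMBuildAdd(builder, LLVMGetParam(add_fn, 0),
                                    LLVMGetParam(add_fn, 1), "sum");
    LLVMBuildRet(builder, sum);

    /* Hand the module to the JIT: native code is generated in memory. */
    LLVMLinkInMCJIT();
    LLVMInitializeNativeTarget();
    LLVMInitializeNativeAsmPrinter();
    LLVMExecutionEngineRef engine;
    char *error = NULL;
    if (LLVMCreateExecutionEngineForModule(&engine, module, &error)) {
        fprintf(stderr, "failed to create execution engine: %s\n", error);
        return 1;
    }

    /* LLVM hands back the address of the compiled code; call it like a
     * normal function through a function pointer. */
    int (*add)(int, int) =
        (int (*)(int, int))(uintptr_t)LLVMGetFunctionAddress(engine, "add");
    printf("add(2, 3) = %d\n", add(2, 3));

    LLVMDisposeBuilder(builder);
    LLVMDisposeExecutionEngine(engine); /* also frees the module */
    return 0;
}
```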
Real World examples:
There is also the Kaleidoscope tutorial, which shows how to implement a simple language with a JIT compiler.
First, you compile the C source to LLVM bitcode (LLVM IR).
Second, you run that bitcode on the LLVM JIT; that runs the program.
Then, if you wish to get a native binary, you use the LLVM backend to lower the bitcode to assembly, and from the assembly output you build an ordinary executable.
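A plausible set of commands for those steps, assuming a single source file named test.c (exact flag spellings can differ slightly between LLVM releases):

```sh
# C source -> LLVM bitcode (use -S -emit-llvm instead for readable .ll text)
clang -O2 -emit-llvm -c test.c -o test.bc

# Run the bitcode directly on the LLVM JIT
lli test.bc

# Lower the bitcode to native assembly with the LLVM backend
llc test.bc -o test.s

# Assemble and link the assembly output into a normal executable
clang test.s -o test
./test
```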
I am following the steps to compile and run the JIT'ed code from a mail message on the LLVM community mailing list:
[LLVMdev] MCJIT and Kaleidoscope Tutorial
Header file:
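A minimal version, following the layout of the linked shared-library tutorial (the file name foo.h and the exact contents are assumptions here):

```c
/* foo.h - declaration of the function exported by the shared library */
#ifndef FOO_H
#define FOO_H

void foo(void);

#endif
```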
and the implementation of a simple foo() function:
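For example (the printed message is illustrative):

```c
/* foo.c - definition of foo(), compiled into the shared library */
#include <stdio.h>
#include "foo.h"

void foo(void)
{
    puts("Hello, I am a shared library");
}
```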
And the main function:
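Again an assumed version, which just calls into the library:

```c
/* main.c - the file we will compile to LLVM bitcode and run on the JIT */
#include "foo.h"

int main(void)
{
    foo();
    return 0;
}
```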
Build the shared library using foo.c:
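One way to do it (the library name libfoo.so is an assumption):

```sh
gcc -c -Wall -Werror -fpic foo.c
gcc -shared -o libfoo.so foo.o
```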
Generate the LLVM bitcode for the main.c file:
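Something along these lines:

```sh
clang -Wall -emit-llvm -c main.c -o main.bc
```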
And run the LLVM bitcode through the JIT (MCJIT) to get the desired output:
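Roughly like this; -load pulls the shared library into lli's process so the call to foo() resolves (some older lli releases also needed a -use-mcjit flag, newer ones use MCJIT/ORC by default):

```sh
lli -load=./libfoo.so main.bc
```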
You can also pipe the clang output into lli:
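Assuming lli reads bitcode from standard input when no file is given, something like:

```sh
clang -Wall -emit-llvm -c main.c -o - | lli -load=./libfoo.so
```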
Output: the message printed by foo().
Source obtained from: Shared libraries with GCC on Linux
Most compilers have a front end, a middle representation of some sort, and a back end. When you take your C program, use clang, and compile it so that you end up with a non-JIT x86 program that you can just run, you have still gone from front end to middle to back end. The same goes for gcc: it goes from a front end to a middle representation and a back end. GCC's middle representation is not wide open and usable on its own the way LLVM's is.
Now, one thing that is fun/interesting about LLVM, which you cannot do with others (or at least not with gcc), is that you can take all of your source modules, compile them to LLVM bytecode, merge them into one big bytecode file, and then optimize the whole thing. Instead of the per-file or per-function optimization you get with other compilers, with LLVM you can get any level of partial to complete whole-program optimization you like. Then you can take that bytecode and use llc to export it to the target's assembler. I normally do embedded work, so I have my own startup code that I wrap around that, but in theory you should be able to take that assembler file, compile and link it with gcc, and run it: gcc myfile.s -o myfile. I imagine there is a way to get the LLVM tools to do this without having to use binutils or gcc, but I have not taken the time.
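A sketch of that whole-program flow, assuming two hypothetical source files a.c and b.c (on newer LLVM releases opt may prefer the -passes='default&lt;O3&gt;' spelling over -O3):

```sh
# compile each module to LLVM bytecode
clang -emit-llvm -c a.c -o a.bc
clang -emit-llvm -c b.c -o b.bc

# merge into one big bytecode file and optimize the whole program at once
llvm-link a.bc b.bc -o whole.bc
opt -O3 whole.bc -o whole.opt.bc

# export to the target's assembler, then assemble and link it
llc whole.opt.bc -o whole.s
gcc whole.s -o myprog
```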
I like LLVM because it is always a cross compiler; unlike gcc, you don't have to build a new compiler for each target and deal with each target's nuances. I don't know that I have any use for the JIT, is what I am saying: I use it as a cross compiler and as a native compiler.
So your first case is front, middle, and end, with the process hidden from you: you start with source and get a binary, done. The second case, if I understand it right, is the front and the middle, stopping with some file that represents the middle. Then the middle-to-end step (for the specific target processor) can happen just in time at runtime. The difference is in the back end: the runtime execution of case two's middle language likely uses a different back end than case one.