Linking LLVM JIT code to static LLVM libraries?
I'm in the process of implementing a cross-platform (Mac OS X, Windows, and Linux) application which will do lots of CPU-intensive analysis of financial data. The bulk of the analysis engine will be written in C++ for speed reasons, with a user-accessible scripting engine interfacing with the C++ testing engine. I want to write several scripting front-ends over time to emulate other popular software with existing large user bases. The first front-end will be a Visual Basic-like scripting language.
I'm thinking that LLVM would be perfect for my needs. Performance is very important because of the sheer amount of data; a single run of tests can take hours or days to produce an answer. I believe that using LLVM will also allow me to use a single back-end solution while I implement different front-ends for different flavors of the scripting language over time.
The testing engine itself will be separated from the interface and testing will even take place in a separate process with progress and results being reported to the testing management interface. Tests will consist of scripting code integrated with the testing engine code.
In a previous implementation of a similar commercial testing system I wrote, I built a fast interpreter which easily interfaced with the testing library because it was written in C++ and linked directly to the testing engine library. Callbacks from scripting code to testing library objects involved translating between the formats with significant overhead.
I'm imagining that with LLVM, I could implement the callbacks into C++ directly so that I could make the scripting code work almost as if it had been written in C++. Likewise, if all the code were compiled to LLVM bitcode, it seems like the LLVM optimizers could optimize across the boundaries between the scripting language and the testing engine code that was written in C++.
I don't want to have to compile the testing engine every time. Ideally, I'd like to JIT compile only the scripting code. For small tests, I'd skip some optimization passes, while for large tests, I'd perform full optimizations during the link.
So is this possible? Can I precompile the testing engine to a .o object file or .a library file and then link in the scripting code using the JIT?
Finally, ideally, I'd like to have the scripting code implement specific methods in subclasses of a specific C++ class. The C++ testing engine would then only see C++ objects, while the JIT setup code compiled scripting code that implemented some of the methods for those objects. It seems that if I used the right name mangling algorithm, it would be relatively easy to set up the LLVM generation for the scripting language so that its output looks like a C++ method, which could then be linked into the testing engine.
Thus the linking stage would go in two directions: calls from the scripting language into the testing engine objects to retrieve pricing information and test state information, and calls from the testing engine to methods of particular C++ objects whose code was supplied not from C++ but from the scripting language.
In summary:
1) Can I link in precompiled (either .bc, .o, or .a) files as part of the JIT compilation, code-generation process?
2) Can I link in code using the process in 1) above in such a way that I am able to create code that acts as if it was all written in C++?
Comments (3)
clang has limited support for C++ right now.
1) You can load and link .bc files. .o files should also be loadable if they have been compiled into a .so shared library, and the symbols in them should then be usable.
2) As long as you don't want to do horrible things with the callbacks, you can probably just pass standard C function pointers and do callbacks through them. You can do certain other things too, but trying to define C++ objects or templates, or to call member functions, while not being a C++ compiler is something you do not want to do.
You must know the C++ ABI, you must know about the platform you target, and you must know all sorts of other things; you effectively must be a C++ compiler to generate code that looks like it is C++. The name mangler is one of the most annoying parts.
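To make the function-pointer route concrete, here is a minimal sketch of one way it could look. Everything in it is hypothetical: `Strategy`, `ScriptedStrategy`, `on_bar_fn`, `run_test` and `script_on_bar` are names invented for illustration, not part of LLVM or of the asker's engine. The idea is that a small adapter subclass is compiled once, in C++, together with the engine; the script back end only has to emit a function with a plain C signature, so it never needs to know about mangling or vtable layout. An ordinary C++ function stands in for the JIT output here; in a real setup the pointer would presumably come from whatever symbol lookup the JIT provides.

```cpp
#include <cassert>
#include <vector>

// Hypothetical engine-side interface. The engine only ever sees this class.
class Strategy {
public:
    virtual ~Strategy() = default;
    virtual double on_bar(double price) = 0;
};

// Plain C signature that the script back end would target.
extern "C" {
typedef double (*on_bar_fn)(void* script_state, double price);
}

// Adapter written in C++ and compiled with the engine. It forwards the
// virtual call to a function pointer, so the C++ compiler handles the ABI.
class ScriptedStrategy final : public Strategy {
public:
    ScriptedStrategy(on_bar_fn fn, void* state) : fn_(fn), state_(state) {}
    double on_bar(double price) override { return fn_(state_, price); }

private:
    on_bar_fn fn_;   // would be obtained from the JIT in a real setup
    void* state_;    // opaque per-script data passed back on each call
};

// Engine code: works purely against the C++ interface.
double run_test(Strategy& strategy, const std::vector<double>& prices) {
    double total = 0.0;
    for (double p : prices) total += strategy.on_bar(p);
    return total;
}

// Stand-in for JIT-compiled script code (assumption: the real function
// would be generated from the scripting language, not written in C++).
struct ScriptState {
    double threshold;
    int calls;
};

extern "C" double script_on_bar(void* state, double price) {
    ScriptState* s = static_cast<ScriptState*>(state);
    ++s->calls;
    return price > s->threshold ? 1.0 : 0.0;  // 1.0 when price exceeds threshold
}

// Wires the pieces together; returns the number of bars above the threshold.
double run_demo() {
    ScriptState st{100.0, 0};
    ScriptedStrategy strategy(&script_on_bar, &st);
    double signals = run_test(strategy, {99.0, 101.0, 105.0});
    assert(st.calls == 3);  // the engine called back once per price
    return signals;
}
```

Calling `run_demo()` drives three prices through the engine loop and counts the two that exceed the threshold. The cost of this design is one indirect call per callback, and the optimizer cannot inline across that pointer, so it trades some of the cross-boundary optimization the question hopes for against not having to reproduce the C++ ABI by hand.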