Call a C/asm function from OCaml more directly than with caml_c_call
OCaml allows C functions to be called from OCaml programs, as long as the programmer follows the instructions in the "Interfacing C with OCaml" chapter of the manual.
When following these instructions, a call to a C function is translated by the native compiler to:
movq ml_as_z_sub@GOTPCREL(%rip), %rax
call caml_c_call@PLT
(amd64 instruction set here, but looking at other architectures, the scheme seems to be rather uniform).
The function caml_c_call eventually does a computed jump, call *%rax, but it does a lot of things before and after. From asmrun/amd64.S:
/* Call a C function from Caml */
FUNCTION(G(caml_c_call))
.Lcaml_c_call:
/* Record lowest stack address and return address */
popq %r12
STORE_VAR(%r12, caml_last_return_address)
STORE_VAR(%rsp, caml_bottom_of_stack)
/* Make the exception handler and alloc ptr available to the C code */
STORE_VAR(%r15, caml_young_ptr)
STORE_VAR(%r14, caml_exception_pointer)
/* Call the function (address in %rax) */
call *%rax
/* Reload alloc ptr */
LOAD_VAR(caml_young_ptr, %r15)
/* Return to caller */
pushq %r12
ret
When one wants to frequently execute a couple of instructions that neither allocate nor raise exceptions, the above is a bit of overkill.
Does anyone have any experience in calling a small assembly routine directly from OCaml, without going through the caml_c_call stub? This probably involves tricking the native compiler into thinking that it is calling an ML function, or modifying the compiler.
The question is in the context of the library Zarith, where small bits of assembly code could compute and return most results directly, without having to go through caml_c_call, and only jump to caml_c_call for the difficult arguments that require allocation or exceptions. See this file for examples of assembly bits that could be executed directly.
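The fast-path/slow-path split described above can be sketched in C. This is only an illustration, not Zarith's actual code: the name fast_sub is hypothetical, the macros are local stand-ins modeled on the ones in OCaml's caml/mlvalues.h, and __builtin_sub_overflow assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for OCaml's tagged-integer macros (assumption:
   an OCaml int n is stored as the machine word 2n+1). */
typedef intptr_t value;
#define Val_long(x) ((value)((((uintptr_t)(x)) << 1) | 1))
#define Long_val(v) ((v) >> 1)
#define Is_long(v)  (((v) & 1) != 0)

/* Hypothetical fast path for subtraction.
   Returns true and stores the tagged result in *res when both
   arguments are immediate integers and the result fits.
   Returns false when the caller must take the slow path
   (the one that may allocate or raise, i.e. go through caml_c_call). */
bool fast_sub(value a, value b, value *res)
{
    intptr_t diff;

    if (!Is_long(a) || !Is_long(b))
        return false;                 /* boxed big integer: slow path */

    /* (2x+1) - (2y+1) = 2(x-y); machine overflow here means that
       x-y does not fit in a tagged integer. */
    if (__builtin_sub_overflow(a, b, &diff))
        return false;                 /* overflow: slow path */

    *res = diff | 1;                  /* diff is even, so this re-tags it */
    return true;
}
```

A caller would try fast_sub first and fall back to the ordinary stub only when it returns false, so the common case involves neither allocation nor exceptions.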
Comments (3)
Maybe "noalloc" and "float" could be of some use?
PS some more related links.
It sounds like you wouldn't mind the overhead of an OCaml function call if the function you were calling could be written in assembly. I just did some experimentation, and you can do this by the method I outlined above.
Here's what I did. To get a workable assembly language template, I defined a simple function in OCaml and compiled with the -S flag.
Note: you need to specify -inline 0 to ensure that ocamlopt takes the code from your generated .o file and not from the inline definition in the .cmx file.
Now you have a file named sep.s. The addto function looks like this (amazingly good code, actually):
Just for a test, I changed the 2 (which represents 1 in OCaml) to 4 (which represents 2 in OCaml). So you now have:
Now assemble this file, producing a deviant version of sep.o.
In essence, you have tricked ocamlopt into treating the code in sep.o as if it was coded in OCaml. But you can write the code yourself in assembly (if you're careful not to violate any of the architectural assumptions).
You can link it into a main program and run it:
As you can see, it runs the modified assembly code.
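The effect of the 2-to-4 edit follows from OCaml's tagged integer representation, where the int n is stored as the machine word 2n+1. The C sketch below models this under the assumption that addto simply adds 1 to its argument, which is what the edit suggests; the function names and macros are illustrative stand-ins, not code from the answer.

```c
#include <stdint.h>

/* Assumption: an OCaml int n is stored as the machine word 2n+1. */
typedef intptr_t value;
#define Val_long(x) ((value)((((uintptr_t)(x)) << 1) | 1))
#define Long_val(v) ((v) >> 1)

/* Adding the immediate 2 to a tagged word adds 1 to the OCaml int:
   (2n+1) + 2 = 2(n+1) + 1. */
value addto_original(value x)
{
    return x + 2;
}

/* After patching the immediate to 4, the function adds 2 instead:
   (2n+1) + 4 = 2(n+2) + 1. */
value addto_patched(value x)
{
    return x + 4;
}
```

Because the tag bit survives the addition, no untagging or retagging is needed, which is why the compiled function can be so short.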
You could follow this procedure to create any OCaml-callable functions in assembly code. As long as you don't mind the overhead of an OCaml function call, this approach might do what you want.
I don't know how this trickery will affect the handling of debugging and garbage collection, so I wouldn't try this with a function that does any allocations.
These tests were run on Mac OS X 10.6.8 using OCaml 3.12.0 (the stock 64-bit build). When I run "as", I'm running the stock OS X assembler from Xcode 4.0.2, which uses x86_64 architecture by default.
It seems to me it doesn't help to trick the compiler into thinking it's calling an OCaml function, unless you also trick it into inlining the call. As far as I can tell by perusing sources, inlined functions are expressed in something called Ulambda code, which in turn contains primitives. So this line of thinking, anyway, leads to adding primitives for your Zarith operations. If you do that, you have a good (not at all tricky) solution, but it might be more work than you want to do.
For a really tricky approach, you could try post-processing the generated asm code to remove function calls and replace them with in-line code. This kind of trick has been used many times. It usually doesn't hold up for long, but it might be good enough for the short term.
To do this, you'd just give the OCaml compiler the name of a different assembler to run, one that does your modifications before assembling.
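As a rough sketch of what such a post-processing step might look like, the C function below rewrites one line of generated assembly. Everything about it is an assumption made for illustration: the name rewrite_line, the callee symbol and the replacement text are all hypothetical. A real filter would also have to respect register allocation and the frame tables that ocamlopt emits.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` is a call to `callee`, copy
   `inline_code` into `out` and return 1. Otherwise copy the line
   unchanged and return 0. */
int rewrite_line(const char *line, const char *callee,
                 const char *inline_code, char *out, size_t out_size)
{
    const char *p = line + strspn(line, " \t");   /* skip indentation */

    if (strncmp(p, "call", 4) == 0 && (p[4] == ' ' || p[4] == '\t')) {
        size_t n = strlen(callee);

        p += 4;
        p += strspn(p, " \t");                     /* skip to the operand */

        /* Match the whole symbol, optionally followed by a suffix
           such as @PLT, so that longer names are left alone. */
        if (strncmp(p, callee, n) == 0 &&
            (p[n] == '\0' || p[n] == '\n' || p[n] == '@')) {
            snprintf(out, out_size, "%s", inline_code);
            return 1;
        }
    }

    snprintf(out, out_size, "%s", line);
    return 0;
}
```

A wrapper standing in for the assembler would read the .s file line by line, pass each line through a function like this, and hand the result to the real assembler.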