How do I best pass the global offset table (GOT) for my language on x86?
I'm writing a small program loader for my language because I gave up on understanding the ELF format (and while doing this, I may eventually understand it better). I mmap the files into memory and Tux rejoices, or whatever.
I don't want to hinder sharing of the program by making any changes to it. So I end up doing the same thing C and ELF do: using a global offset table.
The problem is: how do I pass the GOT to my program?
The first thing that comes to mind is passing it along in a register or as a stack argument. A register would be great, but x86 is hampered by its small register count. This could mean losing ebx or ebp or some such. On a sensible architecture this would be a fair tradeoff. On x86 it feels like a bit of a loss.
Disassembling a shared library shows me that gcc does it with IP-relative addressing. If I did the same, it would be:
    call here        ; pushes the return address, which is the address of here
here:
    pop eax          ; eax = runtime address of here
    ; do something with [eax + (got - here) + index*4]
This feels somewhat complicated, though, and I'd rather not do it this way.
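To make the address arithmetic in that snippet concrete, here is a rough C sketch. It only simulates the idea: the image layout (HERE_OFF, GOT_OFF, GOT_SLOTS, IMAGE_SIZE) and the helper names are made up for illustration, and runtime_here() stands in for what call/pop would leave in eax. The point it tries to show is that (got - here) is fixed at link time, so the lookup works wherever the image ends up in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical image layout, fixed at link time (offsets from image start). */
enum { HERE_OFF = 0x10, GOT_OFF = 0x40, GOT_SLOTS = 4, IMAGE_SIZE = 0x80 };

/* Stand-in for what `pop eax` yields: the runtime address of label `here`. */
static const unsigned char *runtime_here(const unsigned char *image_base) {
    return image_base + HERE_OFF;
}

/* Loader side: fill one 32-bit slot of the image's GOT. */
static void got_store(unsigned char *image_base, unsigned index, uint32_t value) {
    assert(index < GOT_SLOTS);
    memcpy(image_base + GOT_OFF + index * 4u, &value, sizeof value);
}

/* Code side: [eax + (got - here) + index*4], using only `here` as input. */
static uint32_t got_load(const unsigned char *here, unsigned index) {
    assert(index < GOT_SLOTS);
    uint32_t value;
    memcpy(&value, here + (GOT_OFF - HERE_OFF) + index * 4u, sizeof value);
    return value;
}
```

Two copies of the same image at different addresses each find their own GOT through the same constant offset, which is the property that keeps the code pages shareable.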
Any more ideas, anyone?
Edit: When I got around to handling this with multiple libraries, I realised that I will have multiple GOTs per app, and which GOT applies depends on which chunk of code I am in. So keeping the GOT in a dedicated register is going to require some additional tricks I'm not aware of. I'd like to know how this problem is solved when the GOT is kept in a register.
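One possible shape for such a trick, sketched in C purely as a simulation: pair every cross-module entry point with the GOT of the module that owns it, and switch the "GOT register" around the call. Everything here is an assumption for illustration (got_reg stands in for the dedicated register, fn_desc and call_via_desc are hypothetical names), and it is not a claim about what gcc emits.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the dedicated GOT register (e.g. ebx); hypothetical. */
static const intptr_t *got_reg;

typedef intptr_t (*module_fn)(void);

/* An entry point bundled with the GOT of the module that owns it. */
struct fn_desc {
    module_fn code;
    const intptr_t *got;
};

/* Cross-module call: load the callee's GOT, call, then restore ours. */
static intptr_t call_via_desc(const struct fn_desc *d) {
    const intptr_t *saved = got_reg;
    got_reg = d->got;
    intptr_t result = d->code();
    got_reg = saved;
    return result;
}

/* "Library" module: reads slot 0 of whichever GOT is active. */
static const intptr_t lib_got[] = { 7 };
static intptr_t lib_get(void) { return got_reg[0]; }
static const struct fn_desc lib_get_desc = { lib_get, lib_got };

/* "Application" module: calls into the library, then uses its own GOT again. */
static const intptr_t app_got[] = { 35 };
static intptr_t app_main(void) {
    intptr_t theirs = call_via_desc(&lib_get_desc); /* sees lib_got: 7 */
    return got_reg[0] + theirs;                     /* app_got restored: 35 + 7 */
}
```

The save and restore in call_via_desc is the important part: after the library call returns, app_main reads its own slot again without reloading anything itself.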
Comments (1)
You can use one of the segment registers (or rather its base) to hold the base of your binary image. You would then refer to your global data as, e.g., FS:xxx.
These registers are remnants of the so-called segmented memory model. Basically, segments are "windows" into the linear address space with a specified base (and limit). If you address through one (e.g. the address 0010:00000001), the resulting address is (base of the segment with selector 0010) + 00000001. The base and the other parameters of a segment are stored in a descriptor table (there is more than one of these), which is a special area in memory. Descriptor tables can only be modified in kernel mode; Linux has syscalls that do this (modify_ldt, arch_prctl). In 64-bit mode the situation is a little more complicated: the base of most segments is treated as zero there, and only FS and GS keep a usable base. For reference, see the AMD64 architecture manual, especially Volume 2: System Programming.
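As a rough illustration of the selector:offset arithmetic described above, here is a simplified C model. It is a sketch under assumptions: the table contents are invented, the descriptor keeps only base and limit (real descriptors carry type, privilege and granularity bits too), and the names are hypothetical. The one hardware detail it relies on is that the upper 13 bits of a selector are the table index, so selector 0010 picks entry 2.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified descriptor: only base and limit; real ones carry more fields. */
struct descriptor {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset inside the segment */
};

/* Hypothetical descriptor table; entry 0 plays the role of the null entry. */
static const struct descriptor table[] = {
    { 0x00000000u, 0x00000000u },
    { 0x00000000u, 0xFFFFFFFFu },  /* flat segment covering everything */
    { 0x08048000u, 0x0000FFFFu },  /* 64 KiB window over a loaded image */
};

/* Bits 3..15 of a selector are the table index; the low 3 bits are TI/RPL. */
static unsigned selector_index(uint16_t selector) {
    return selector >> 3;
}

/* Translate selector:offset to a linear address.
 * Returns 0 on success, -1 where the CPU would raise a fault. */
static int translate(uint16_t selector, uint32_t offset, uint32_t *linear) {
    unsigned i = selector_index(selector);
    if (i == 0 || i >= sizeof table / sizeof table[0])
        return -1;                      /* null or out-of-table selector */
    if (offset > table[i].limit)
        return -1;                      /* offset outside the window */
    *linear = table[i].base + offset;
    return 0;
}
```

With a table like this, pointing the third entry at the image base is what would let code reach its globals through a fixed segment-relative offset.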