Binary translation | Cross-compilation

Posted on 2024-10-22 01:34:11

Say you are writing compilers for several architectures that differ in endianness, and each of them has memory read and write instructions.

Take a store instruction as an example, where you want to store the value 0xAABBCCDD. When writing the assembly for this, do you write two different instructions for the different architectures, e.g.

For little-endian: st (reg), 0xDDCCBBAA

For big-endian: st (reg), 0xAABBCCDD

Or do you write the same instruction, say st (reg), 0xAABBCCDD, for both architectures and let the processor interpret it according to the endianness of the system?

The reason I ask is that I don't know what a binary translator does when it has to translate code between architectures of different endianness. If in architecture A you see the line st (reg), XY, do you convert it into st (reg), YX for architecture B? If so, what happens to memory reads?

I would like to know how to handle endianness for memory reads and writes in binary translation.

Comments (2)

梦晓ヶ微光ヅ倾城 2024-10-29 01:34:11

Endianness has nothing to do with how memory is read or written; it only determines, when memory is interpreted as a number, whether the most significant byte comes first or last. It is only the implementation of the arithmetic that makes the difference.

So your binary translator, if such a thing even exists, won't change anything; only instructions like ADD, SUB and MUL interpret numbers differently.

私藏温柔 2024-10-29 01:34:11

I'm not sure I understand your question fully, but it sounds like you want to translate some assembly-language code or a disassembled binary?

Every assembler I've ever worked with handles the endianness of constants in a sane way. That is to say, if you want to store 0xAABBCCDD, you would write:

st (reg), 0xAABBCCDD

And the assembler will swizzle the constant if necessary when encoding it into the instruction. Endianness becomes a concern when you want to store multiple single-byte values with that one operation, such as writing the short null-terminated string "123" to memory with a single word store. You have to swizzle that constant yourself in your assembly code so that the bytes land in memory in the right order on little- vs. big-endian systems:

st (reg), 0x31323300 // big-endian
st (reg), 0x00333231 // little-endian
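
To see why the two constants differ, here is a small Python sketch (not part of the original answer) that models what a 32-bit word store leaves in memory. The helper name `store32` is made up for illustration; it relies only on Python's built-in `int.to_bytes`.

```python
def store32(value: int, byteorder: str) -> bytes:
    """Return the 4 bytes a 32-bit word store leaves in memory, lowest address first.

    `byteorder` is "big" or "little". Hypothetical helper, for illustration only.
    """
    return value.to_bytes(4, byteorder)


# The constant that spells "123\0" when stored by a big-endian machine...
on_big = store32(0x31323300, "big")
# ...comes out byte-reversed if the same word store runs on a little-endian machine.
same_constant_on_little = store32(0x31323300, "little")
# Swizzling the constant in the source restores the intended byte sequence.
swizzled_on_little = store32(0x00333231, "little")

print(on_big)
print(same_constant_on_little)
print(swizzled_on_little)
```

The first and third results are the same byte sequence; the second is its reverse, which is why the packed-string constant has to be written differently per byte order.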

The safe way is to just store the bytes in the order you want them:

stb (reg+0), 0x31
stb (reg+1), 0x32
stb (reg+2), 0x33
stb (reg+3), 0x00

But that takes four instructions instead of one.
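
As for the binary-translation side of the question, one plausible approach (sketched here under my own assumptions, not taken from any particular translator) is to leave constants and register values alone and apply the guest's byte order at every multi-byte memory access, loads and stores alike. The `GuestMemory` class, its method names, and `run_guest_code` are all hypothetical.

```python
class GuestMemory:
    """Hypothetical model of guest RAM as seen by a binary translator."""

    def __init__(self, size: int, guest_byteorder: str) -> None:
        self.data = bytearray(size)
        self.guest_byteorder = guest_byteorder  # "big" or "little"

    def store32(self, addr: int, value: int) -> None:
        # The numeric value is unchanged; only its byte layout follows the guest.
        self.data[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, self.guest_byteorder)

    def load32(self, addr: int) -> int:
        # Loads apply the same byte order, so a store/load round trip is lossless.
        return int.from_bytes(self.data[addr:addr + 4], self.guest_byteorder)

    def store8(self, addr: int, value: int) -> None:
        # Single-byte accesses are unaffected by byte order.
        self.data[addr] = value & 0xFF


def run_guest_code(mem: GuestMemory) -> None:
    """Stand-in for translated guest code: one word store, then four byte stores."""
    mem.store32(0, 0xAABBCCDD)
    for offset, byte in enumerate(b"123\x00"):
        mem.store8(4 + offset, byte)


big = GuestMemory(8, "big")
little = GuestMemory(8, "little")
run_guest_code(big)
run_guest_code(little)

print(bytes(big.data[0:4]).hex(), bytes(little.data[0:4]).hex())
print(hex(big.load32(0)), hex(little.load32(0)))
print(bytes(big.data[4:8]) == bytes(little.data[4:8]))
```

In this model the word store produces different memory images for the two guests, yet each guest reads back the same number, and the byte-by-byte stores produce identical memory either way. That matches the point above: the instruction's operand is not rewritten, and byte-wise stores are safe on both kinds of system.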
