How does an assembler work at the hardware level?
I have been reading online about how an assembler works, but it is quite confusing. To summarize what I have understood so far: an assembler is basically a text parser with access to a lookup table that maps assembly language instructions to the equivalent binary instructions. Am I correct? If so, where does this lookup table exist in the physical hardware of a CPU?
Comments (2)
The CPU executes machine code (a bunch of numbers). Those numbers are typically arranged in a certain way (to make it easier for the CPU to decode), with various rules for the overall format: which pieces determine the opcode, which pieces determine the instruction's operands, etc.
Assembly language is mostly (not completely; see later) a "plain text" representation of machine code. All of the rules the CPU uses (for machine code) become rules used by the assembler. For example, if the CPU documentation that describes the machine code for some instructions says "bits 4 to 7 determine which register is used for the 1st operand", then the assembler might have a function (or maybe a table) to convert register names into the right values for bits 4 to 7. Something similar happens for instruction groups (a function or table to convert the instruction's mnemonic into however many opcode bytes are needed).
All of the stuff used to convert text into machine code (functions, tables, etc.) is created by whoever wrote the assembler (to comply with the CPU's documentation for how everything is encoded into machine code). None of this comes from the CPU itself, and most assemblers can run on a completely different CPU from the one they target (e.g. most 80x86 assemblers can easily be ported to run on ARM, PowerPC, MIPS, and so on).
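To make the "bits 4 to 7" idea concrete, here is a minimal sketch in Python. The instruction layout, the opcode values, and the names `OPCODES`, `REGISTERS` and `encode` are all invented for illustration; they do not correspond to any real CPU or assembler.

```python
# Hypothetical 16-bit instruction word, invented for this example:
#   bits 12-15: opcode
#   bits  4-7 : register used for the 1st operand
#   bits  0-3 : register used for the 2nd operand
OPCODES = {"mov": 0x1, "add": 0x2, "sub": 0x3}      # mnemonic -> opcode value
REGISTERS = {f"r{n}": n for n in range(16)}          # register name -> 4-bit number


def encode(mnemonic, first, second):
    """Pack one instruction into a 16-bit word using the tables above."""
    opcode = OPCODES[mnemonic]
    return (opcode << 12) | (REGISTERS[first] << 4) | REGISTERS[second]


word = encode("add", "r3", "r5")
print(f"{word:#06x}")  # prints 0x2035
```

Both tables live inside the assembler's own code or data. Nothing here is read from the CPU.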
On top of this, the assembler also has to provide useful error checking and reporting (so that if there's a mistake in the assembly language source code, it's easy for a programmer to figure out what is wrong and where, e.g. using a descriptive error message with a line number); plus support for preprocessing (macros, etc.); plus support for various output file formats (object files to suit different linkers, raw output file types like "flat binary", etc.); plus directives (to control the intended CPU mode, alignment, etc.) and a way for a programmer to describe "data that is not code".
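As a rough illustration of a few of these extras, the sketch below shows error messages with line numbers, a data directive, and "flat binary" output. The one-byte opcodes, the `.byte` directive syntax, the `;` comment character, and the names `AsmError` and `assemble` are assumptions made up for the example, not any real assembler's behaviour.

```python
class AsmError(Exception):
    """Assembly error whose message includes the source line number."""


OPCODES = {"nop": 0x00, "inc": 0x01, "halt": 0xFF}  # hypothetical one-byte opcodes


def assemble(source):
    """Translate source text into a flat binary (a plain bytes object)."""
    out = bytearray()
    for lineno, raw in enumerate(source.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()      # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        word = parts[0].lower()
        args = parts[1] if len(parts) > 1 else ""
        if word == ".byte":                      # directive: data that is not code
            for item in args.split(","):
                text = item.strip()
                try:
                    value = int(text, 0)         # accepts 65, 0x41, 0b1000001
                except ValueError:
                    raise AsmError(f"line {lineno}: bad byte value {text!r}") from None
                if not 0 <= value <= 0xFF:
                    raise AsmError(f"line {lineno}: byte value {value} out of range")
                out.append(value)
        elif word in OPCODES:
            if args:
                raise AsmError(f"line {lineno}: '{word}' takes no operands")
            out.append(OPCODES[word])
        else:
            raise AsmError(f"line {lineno}: unknown instruction '{word}'")
    return bytes(out)


print(assemble("inc\n.byte 0x41, 66 ; two data bytes\nhalt\n"))  # b'\x01AB\xff'
```

Feeding it `"nop\nfoo\n"` raises `AsmError` with the message `line 2: unknown instruction 'foo'`, which is the kind of report a programmer needs to find the mistake.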
All of this other stuff is also created by whoever wrote the assembler.
First, there is an Instruction Set Architecture (ISA): a specification published as text for human consumption, usually by a CPU vendor. This document specifies each and every machine code instruction that is available for programs to use and for processors to implement. An ISA specification defines the fundamental boundary between software and hardware; it is the fundamental agreement (or meeting of the minds) between software programmers and hardware implementers.
As a convenience, the ISA specification may also include a "preferred" or suggested assembly form for each machine code instruction.
An assembler is a program written by people who use an ISA specification to guide the translation of assembly code into machine code. The mechanism they use to accomplish the translation is contained within the program code of the assembler, and may involve a table with pattern matching, or may be done using ordinary programming (e.g. if-then statements), all informed by the ISA specification. There's no one right way to design an assembler.
The translation (assembling) is entirely under the control of the assembler program (without consulting the hardware). Consider, for example, that we can run an assembler on Windows x64 that accepts assembly code for ARM Linux and generates machine code for it. Two very different processors are involved: one is actually running the assembler program, and the other is the intended target of the assembled machine code. So there is no direct relationship between the processor running the assembler and the generated machine code.
There can be many assemblers for the same ISA. The authors of a particular assembler will publish a specification for their assembly language, which shows how to specify the ISA's machine code instructions using their versions of assembly mnemonics and other syntax (such as addressing modes, labels, etc.).
The hardware is also designed by people who use this ISA specification to implement the machine code instructions and all their variations. There may be tables, there may be microcode (which some might consider as lookup "tables" describing the actions to accomplish an instruction). As with the assembler, there are many possible approaches and no one right way to implement an instruction set.
Thus, fundamental to both software and hardware is agreement on the Instruction Set Architecture. Software programmers accept that hardware will implement this specification, and hardware designers accept that software will use this specification.
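The two styles mentioned here can be sketched side by side. This is only an illustration: the three instructions, their byte encodings, and the function names `assemble_with_table` and `assemble_with_ifs` are hypothetical, and the if-then version does less input validation than the table version.

```python
import re

# Style 1: a table of (syntax pattern, encoder) rows, tried in order.
PATTERNS = [
    (re.compile(r"ret"), lambda m: bytes([0xC0])),
    (re.compile(r"push r([0-7])"), lambda m: bytes([0x50 | int(m.group(1))])),
    (re.compile(r"load r([0-7]), (\d+)"),
     lambda m: bytes([0x80 | int(m.group(1)), int(m.group(2)) & 0xFF])),
]


def assemble_with_table(line):
    """Find the first row whose pattern matches the whole line and encode it."""
    for pattern, encoder in PATTERNS:
        match = pattern.fullmatch(line)
        if match:
            return encoder(match)
    raise ValueError(f"cannot assemble: {line!r}")


# Style 2: ordinary if-then code that handles each mnemonic by hand.
def assemble_with_ifs(line):
    mnemonic, _, rest = line.partition(" ")
    if mnemonic == "ret":
        return bytes([0xC0])
    if mnemonic == "push":
        return bytes([0x50 | int(rest[1:])])          # rest looks like "r3"
    if mnemonic == "load":
        reg, imm = rest.split(", ")                   # rest looks like "r2, 200"
        return bytes([0x80 | int(reg[1:]), int(imm) & 0xFF])
    raise ValueError(f"cannot assemble: {line!r}")


for text in ["ret", "push r3", "load r2, 200"]:
    assert assemble_with_table(text) == assemble_with_ifs(text)
```

For well-formed input both functions produce the same bytes, because both follow the same (made-up) encoding rules. The difference is only in how the assembler's author chose to organize the code.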
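The hardware side can be modelled in software to show where a "table" might appear there. The sketch below is a toy simulator, not a description of real circuitry: the 16-bit layout (bits 12-15 opcode, bits 4-7 first register, bits 0-3 second register), the opcode values, and the names `decode`, `OPERATIONS` and `step` are all assumptions invented for the example.

```python
def decode(word):
    """Split a 16-bit instruction word into (opcode, first, second) fields."""
    return (word >> 12) & 0xF, (word >> 4) & 0xF, word & 0xF


# Dispatch table: opcode -> action on (first operand value, second operand value).
# Loosely analogous to the decode tables or microcode a real design might use.
OPERATIONS = {
    0x1: lambda dst, src: src,                      # mov
    0x2: lambda dst, src: (dst + src) & 0xFFFF,     # add, wrapping at 16 bits
    0x3: lambda dst, src: (dst - src) & 0xFFFF,     # sub, wrapping at 16 bits
}


def step(registers, word):
    """Execute one instruction word against a list of 16 register values."""
    opcode, first, second = decode(word)
    operation = OPERATIONS.get(opcode)
    if operation is None:
        raise ValueError(f"undefined opcode {opcode:#x}")
    registers[first] = operation(registers[first], registers[second])


regs = [0] * 16
regs[3], regs[5] = 10, 32
step(regs, 0x2035)      # opcode 0x2 (add), first register 3, second register 5
print(regs[3])          # prints 42
```

Note that the decoder only ever sees numbers. Mnemonics and register names exist in the assembler and the documentation, never in the processor.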