Assembly code vs machine code vs object code?
What is the difference between object code, machine code and assembly code?
Can you give a visual example of their difference?
Machine code is binary (1's and 0's) code that can be executed directly by the CPU. If you open a machine code file in a text editor you would see garbage, including unprintable characters (no, not those unprintable characters ;) ).
Object code is a portion of machine code not yet linked into a complete program. It's the machine code for one particular library or module that will make up the completed product. It may also contain placeholders or offsets not found in the machine code of a completed program. The linker will use these placeholders and offsets to connect everything together.
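The placeholder idea can be sketched in a few lines. This is only a toy model under stated assumptions: the `ToyObject` class, the made-up opcode `0x01`, and the symbol name `print_line` are all hypothetical, and real object formats such as ELF or COFF are far richer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyObject:
    """Toy object file: raw bytes plus placeholders the linker must fill in."""
    code: bytearray
    # (offset into code, symbol name) pairs marking 2-byte placeholders
    relocations: list = field(default_factory=list)


def link(obj: ToyObject, symbol_addresses: dict) -> bytes:
    """Return final machine code with every placeholder patched."""
    out = bytearray(obj.code)
    for offset, symbol in obj.relocations:
        address = symbol_addresses[symbol]
        # Overwrite the 2-byte placeholder with the real little-endian address.
        out[offset:offset + 2] = address.to_bytes(2, "little")
    return bytes(out)


# Hypothetical 3-byte instruction: made-up opcode 0x01 followed by a
# zeroed 2-byte placeholder for the address of an external routine.
obj = ToyObject(code=bytearray([0x01, 0x00, 0x00]),
                relocations=[(1, "print_line")])
linked = link(obj, {"print_line": 0x1234})
print(linked.hex(" "))
```

Before linking, the bytes after the opcode are zeros; after linking they hold the address the linker chose for `print_line`.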
Assembly code is plain text and (somewhat) human-readable source code that mostly has a direct 1:1 analog with machine instructions. This is accomplished using mnemonics for the actual instructions, registers, or other resources. Examples include JMP and MULT for the CPU's jump and multiplication instructions. Unlike machine code, the CPU does not understand assembly code. You convert assembly code to machine code with an assembler or a compiler, though we usually think of compilers in association with high-level programming languages that are abstracted further from the CPU instructions.
Building a complete program involves writing source code for the program in either assembly or a higher-level language like C++. The source code is assembled (for assembly code) or compiled (for higher-level languages) to object code, and individual modules are linked together to become the machine code for the final program. In the case of very simple programs, the linking step may not be needed. In other cases, such as with an IDE (integrated development environment), the linker and compiler may be invoked together. In still other cases, a complicated make script or solution file may be used to tell the environment how to build the final application.
There are also interpreted languages that behave differently. Interpreted languages rely on the machine code of a special interpreter program. At the basic level, an interpreter parses the source code and immediately carries out each command by running the corresponding machine code inside the interpreter itself. Modern interpreters are now much more complicated: they evaluate whole sections of source code at a time, cache and optimize where possible, and handle complex memory management tasks.
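A minimal sketch of the basic idea, assuming a made-up three-command language (`set`, `add`, `print`) that exists only for this illustration. Each line is parsed and carried out immediately by code that is already part of the interpreter.

```python
def interpret(source: str) -> list:
    """Execute a tiny made-up language one line at a time."""
    variables = {}
    output = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue
        command, args = parts[0], parts[1:]
        if command == "set":        # set x 5   -> x = 5
            variables[args[0]] = int(args[1])
        elif command == "add":      # add x 3   -> x = x + 3
            variables[args[0]] += int(args[1])
        elif command == "print":    # print x   -> record the value of x
            output.append(variables[args[0]])
        else:
            raise ValueError(f"unknown command: {command}")
    return output


program = """
set x 5
add x 3
print x
"""
print(interpret(program))
```

No separate compile or link step exists here: the program text is the only artifact, and it runs only as long as the interpreter is running.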
One final type of program involves the use of a runtime environment or virtual machine. In this situation, a program is first pre-compiled to a lower-level intermediate language or byte code. The byte code is then loaded by the virtual machine, which just-in-time compiles it to native code. The advantage here is the virtual machine can take advantage of optimizations available at the time the program runs and for that specific environment. A compiler belongs to the developer, and therefore must produce relatively generic (less-optimized) machine code that could run in many places. The runtime environment or virtual machine, however, is located on the end user's computer and therefore can take advantage of all the features provided by that system.
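Byte code is easy to look at in practice. The sketch below uses CPython only because it is readily available; note that CPython traditionally interprets its byte code rather than just-in-time compiling it, so it illustrates the intermediate form, not the JIT step. The expression and variable names are arbitrary.

```python
import dis

source = "a * b + 1"
# Compile source text into a code object that holds byte code.
code = compile(source, "<example>", "eval")

print(type(code.co_code))   # the raw byte code is just a bytes object
dis.dis(code)               # readable listing; opcodes vary by Python version

# The virtual machine executes the byte code with concrete values.
result = eval(code, {"a": 6, "b": 7})
print(result)
```

The same code object could be run on any machine with a compatible interpreter, which is the portability the paragraph above describes.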
The other answers gave a good description of the difference, but you asked for a visual also. Here is a diagram showing the journey from C code to an executable.
Assembly code is a human readable representation of machine code:
Machine code is pure hexadecimal code:
I assume you mean object code as in an object file. This is a variant of machine code, with a difference that the jumps are sort of parameterized such that a linker can fill them in.
An assembler is used to convert assembly code into machine code (object code).
A linker links several object (and library) files to generate an executable.
I once wrote an assembly program in pure hex (no assembler was available). Luckily, this was way back on the good old (ancient) 6502. But I'm glad there are assemblers for the Pentium opcodes.
8B 5D 32 is machine code
mov ebx, [ebp+32h] is assembly
lmylib.so containing 8B 5D 32 is object code
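To see why those three bytes and that one line of assembly are the same thing, here is a minimal sketch of a decoder. It assumes 32-bit x86 mode and handles only opcode 8B with a register base and an 8-bit displacement; the function name is made up for this example, and a real disassembler covers vastly more cases.

```python
REGISTERS_32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def disassemble_mov(machine_code: bytes) -> str:
    """Decode opcode 8B (mov r32, r/m32) in its [reg+disp8] form only."""
    opcode, modrm, disp8 = machine_code
    if opcode != 0x8B:
        raise ValueError("only opcode 8B is handled in this sketch")
    mod = modrm >> 6              # top 2 bits: addressing mode
    reg = (modrm >> 3) & 0b111    # middle 3 bits: destination register
    rm = modrm & 0b111            # low 3 bits: base register
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [reg+disp8] without a SIB byte is handled")
    # The displacement byte is a signed 8-bit value.
    disp = disp8 - 256 if disp8 >= 0x80 else disp8
    sign = "+" if disp >= 0 else "-"
    return f"mov {REGISTERS_32[reg]}, [{REGISTERS_32[rm]}{sign}{abs(disp):x}h]"


print(disassemble_mov(bytes.fromhex("8B 5D 32")))
```

The byte 5D splits into mode 01, register 011 (ebx) and base 101 (ebp), and 32 is the displacement, which is how an assembler or disassembler moves between the two forms.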
Source code, Assembly code, Machine code, Object code, Byte code, Executable file and Library file.
These terms confuse many people because they are often assumed to be mutually exclusive. See the diagram to understand their relations. The description of each term is given below.
Source code
Instructions in human readable (programming) language
High-level code
Instructions written in a high level (programming) language
e.g., C, C++ and Java programs
Assembly code
Instructions written in an assembly language (a kind of low-level programming language).
As the first step of the compilation process, high-level code is converted into this form. The assembly code is then converted into actual machine code. On most systems, these two steps are performed automatically as part of the compilation process.
e.g., program.asm
Object code
The product of a compilation process. It may be in the form of machine code or byte code.
e.g., file.o
Machine code
Instructions in machine language.
e.g., a.out
Byte code
Instructions in an intermediate form that can be executed by an interpreter such as the JVM.
e.g., Java class file
Executable file
The product of the linking process. It is machine code that can be directly executed by the CPU.
e.g., an .exe file.
Note that in some contexts a file containing byte-code or scripting language instructions may also be considered executable.
Library file
Some code is compiled into this form for reasons such as reusability and is later used by executable files.
One point not yet mentioned is that there are a few different types of assembly code. In the most basic form, all numbers used in instructions must be specified as constants. For example:
The above bit of code, if stored at address $1900 in an Atari 2600 cartridge, will display a number of lines in different colors fetched from a table which starts at address $1437. On some tools, typing in an address, along with the rightmost part of the line above, would store to memory the values shown in the middle column, and start the next line with the following address. Typing code in that form was much more convenient than typing in hex, but one had to know the precise addresses of everything.
Most assemblers allow one to use symbolic addresses. The above code would be written more like:
The assembler would automatically adjust the LDA instruction so it would refer to whatever address was mapped to the label ColorTbl. Using this style of assembler makes it much easier to write and edit code than would be possible if one had to hand-key and hand-maintain all addresses.
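How an assembler maps a label to an address can be sketched as a two-pass process. The three-line program below is made up for illustration and is not the listing from this answer; the only real 6502 detail assumed is that `LDA` with an absolute address encodes as opcode AD followed by the address in little-endian order. The `.byte` directive and the `assemble` function are hypothetical.

```python
def assemble(lines, origin):
    """Two-pass toy assembler for a tiny subset: labels, LDA absolute, .byte."""
    # Pass 1: walk the program to learn the address of every label.
    symbols, address = {}, origin
    for line in lines:
        if line.endswith(":"):
            symbols[line[:-1]] = address
        elif line.startswith("LDA "):
            address += 3            # opcode + 2-byte address
        elif line.startswith(".byte "):
            address += 1
        else:
            raise ValueError(f"unsupported line: {line}")
    # Pass 2: emit bytes, replacing label names with their addresses.
    output = bytearray()
    for line in lines:
        if line.endswith(":"):
            continue
        if line.startswith("LDA "):
            target = symbols[line.split()[1]]
            output += bytes([0xAD]) + target.to_bytes(2, "little")
        else:                       # .byte with a hex value
            output.append(int(line.split()[1], 16))
    return bytes(output), symbols


program = ["LDA ColorTbl", "ColorTbl:", ".byte 1E"]
code, symbols = assemble(program, origin=0x1900)
print(code.hex(" "), hex(symbols["ColorTbl"]))
```

If another instruction is inserted before the table, the first pass computes a new address for `ColorTbl` and the `LDA` operand changes automatically, which is exactly the maintenance work the symbolic form saves.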
Assembly consists of short descriptive terms that humans can understand and that can be translated directly into the machine code a CPU actually uses.
While somewhat understandable by humans, assembly is still low level. It takes a lot of code to do anything useful.
So instead we use higher-level languages such as C, BASIC, FORTRAN (OK, I know I've dated myself). When compiled, these produce object code. Early languages had machine language as their object code.
Many languages today, such as Java and C#, usually compile into a bytecode that is not machine code but can easily be interpreted at run time to produce machine code.
Assembly code is discussed here.
"An assembly language is a low-level language for programming computers. It implements a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture."
Machine code is discussed here.
"Machine code or machine language is a system of instructions and data executed directly by a computer's central processing unit."
Basically, assembler code is the language and it is translated to object code (the native code that the CPU runs) by an assembler (analogous to a compiler).
I think these are the main differences
Readability lets code be improved or replaced with little effort six months after it was written. On the other hand, if performance is critical, you may want to use a low-level language that targets the specific hardware you will have in production, to get faster execution.
IMO, today's computers are fast enough to let a programmer get fast execution with OOP.
The source files of your programs are compiled into object files, and then the linker links those object files together, producing an executable file containing your architecture's machine code.
Both object files and executable files contain your architecture's machine code, which shows up as a mix of printable and non-printable characters when opened in a text editor.
Nonetheless, the difference between the two is that an object file may contain unresolved external references (such as printf). So it may need to be linked against other object files. That is, the unresolved external references must be resolved by linking with other object files, such as those of the C/C++ runtime library, to get a runnable executable file.
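The check a linker makes for unresolved external references can be sketched with plain sets. The dictionaries below are a hypothetical stand-in for real symbol tables, and the file names are illustrative only.

```python
def unresolved_symbols(object_files):
    """Return the names that are referenced somewhere but defined nowhere."""
    defined = set()
    referenced = set()
    for obj in object_files:
        defined |= obj["defines"]
        referenced |= obj["references"]
    return referenced - defined


# main.o calls printf but does not contain its code.
main_o = {"name": "main.o", "defines": {"main"}, "references": {"printf"}}
# Stand-in for the C runtime library, which does define printf.
runtime = {"name": "C runtime library", "defines": {"printf", "puts"},
           "references": set()}

print(unresolved_symbols([main_o]))            # printf is still missing
print(unresolved_symbols([main_o, runtime]))   # nothing left to resolve
```

Only when the second call returns an empty set can the linker produce a runnable executable; otherwise it reports an undefined reference.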