How do I understand the bytes in an executable file and how they relate to the program loaded into memory?
The following question is close to mine, but I am still missing a link that would help me understand the loading process:
How does DOS load a program into memory?
My question is: what happens in the machine, step by step, when I type mf.com at the command line?
I am using Windows 7 and have installed NASM to assemble the code.
The following is the assembly that I got from a website.
The file name is mf.asm.
org 100h
mov dx, msg
mov ah, 9
int 21h
ret
msg db "Hello, world !$"
I used the following command to produce my mf.com file:
nasm -f bin mf.asm -o mf.com
Now I run mf.com by typing
mf.com
I get the result
Hello, World !$
I opened the mf.com binary in TextPad, and it is shown like this:
0: BA 08 01 B4 09 CD 21 C3 48 65 6C 6C 6F 2C 20 77 ********Hello, w
10: 6F 72 6C 64 20 21 24 orld !$
The ******** stands for the respective characters that were shown in the text editor.
What happens in the machine, step by step, when I type mf.com at the command line and hit Enter?
In particular, how are the 8 bytes "BA 08 01 B4 09 CD 21 C3" used?
Comments (2)
Those are the bytes that represent the instructions themselves. In general, an x86 instruction is made up of prefix bytes (up to 4, I believe), a primary opcode (1 or 2 bytes), optional ModRM and SIB bytes, displacement bytes, and finally the immediate operands (if any). The machine interprets those bytes depending on the prefix and the primary opcode of the instruction. If you really want to find out, you can find opcode tables that show what those are in binary.
Processors don't interpret instructions as mnemonics; the mnemonics are only there so that it's easier for you to write the code. The assembler changes these mnemonics into something the computer can understand, which is machine code, that is, raw binary data. The hardware takes over from that point.
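To make the byte-to-instruction mapping concrete, here is a minimal sketch in Python that decodes the 8 code bytes from the question. The lookup table is an assumption made for illustration: it only knows the four opcodes that occur in mf.com and is in no way a general x86 disassembler. The names CODE, OPCODES and decode are hypothetical.

```python
# The 8 code bytes at the start of mf.com, as shown in the hex dump.
CODE = bytes.fromhex("BA 08 01 B4 09 CD 21 C3")

# Illustrative table: opcode -> (text template, number of immediate bytes).
# It covers only the opcodes used by this one program.
OPCODES = {
    0xBA: ("mov dx, {:04X}h", 2),  # B8+r encodes MOV r16, imm16; r = 2 is DX
    0xB4: ("mov ah, {:02X}h", 1),  # B0+r encodes MOV r8, imm8; r = 4 is AH
    0xCD: ("int {:02X}h", 1),      # INT imm8
    0xC3: ("ret", 0),              # near RET, no operand
}

def decode(code, origin=0x100):
    """Return (address, text) pairs; origin mirrors the 'org 100h' line."""
    listing = []
    pos = 0
    while pos < len(code):
        template, imm_len = OPCODES[code[pos]]
        # Immediates are stored little-endian: bytes 08 01 mean 0108h.
        imm = int.from_bytes(code[pos + 1:pos + 1 + imm_len], "little")
        listing.append((origin + pos, template.format(imm)))
        pos += 1 + imm_len
    return listing

for address, text in decode(CODE):
    print(f"{address:04X}  {text}")
```

Note how the operand of the first instruction, 0108h, is simply 100h (the load offset) plus 8 (the length of the code), which is where the string begins.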
The bytes
BA 08 01 B4 09 CD 21 C3 48 65 6C 6C 6F 2C 20 77
are the start of your file exactly as it is stored on disk. The first 8 bytes are machine code; the bytes that follow (48 65 6C ...) are the ASCII characters of the string. A disassembler would translate the machine code back into assembly instructions. From your program, the first 8 bytes are the translation of the mov dx, msg, mov ah, 9, int 21h and ret lines. So, in a nutshell, those 8 bytes cause two MOV instructions, an INT instruction and a RET instruction to be executed by your processor. The first MOV loads DX with 0108h, the address of the memory location that holds the string "Hello, world !$".
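As for the step-by-step part of the question, in outline DOS builds a 256-byte Program Segment Prefix (PSP) at offset 0 of a memory segment, copies the file unchanged to offset 100h, pushes a zero word onto the stack and jumps to offset 100h. The toy model below sketches that in Python. It is a simplification under stated assumptions, not how DOS or the Windows 7 DOS emulation is implemented: it models a single 64 KiB segment, only the four opcodes in mf.com, and only DOS function 09h. The names load_com and run are hypothetical.

```python
# Complete contents of mf.com: 8 bytes of code followed by the string.
IMAGE = bytes.fromhex(
    "BA 08 01 B4 09 CD 21 C3"
    "48 65 6C 6C 6F 2C 20 77 6F 72 6C 64 20 21 24"
)

def load_com(image):
    """Toy loader: one zero-filled 64 KiB segment with PSP and program."""
    mem = bytearray(0x10000)
    mem[0:2] = b"\xCD\x20"                 # the PSP begins with INT 20h
    mem[0x100:0x100 + len(image)] = image  # file is copied as-is to 100h
    return mem

def run(mem):
    """Toy CPU: executes only the opcodes that occur in mf.com."""
    ip, sp = 0x100, 0xFFFE  # the word at FFFEh is 0: the pushed return address
    dx = ah = 0
    output = ""
    while True:
        op = mem[ip]
        if op == 0xBA:                      # mov dx, imm16
            dx = mem[ip + 1] | (mem[ip + 2] << 8)
            ip += 3
        elif op == 0xB4:                    # mov ah, imm8
            ah = mem[ip + 1]
            ip += 2
        elif op == 0xCD:                    # int imm8
            number = mem[ip + 1]
            ip += 2
            if number == 0x20:              # INT 20h: terminate the program
                return output
            if number == 0x21 and ah == 9:  # DOS function 09h: print up to '$'
                end = mem.index(b"$", dx)
                output += mem[dx:end].decode("ascii")
            else:
                raise ValueError("interrupt not modeled in this sketch")
        elif op == 0xC3:                    # ret: pop the return address into IP
            ip = mem[sp] | (mem[sp + 1] << 8)
            sp = (sp + 2) & 0xFFFF
        else:
            raise ValueError(f"opcode {op:02X} not modeled in this sketch")

print(run(load_com(IMAGE)))
```

In this model the final ret pops the zero word, so execution continues at offset 0, where the INT 20h stored in the PSP ends the program. That is why a plain ret is enough to exit a .COM program.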