What is wrong with this assembly code accessing a string constant?
I thought I was starting to understand what's going on, but I've now spent ages trying to understand why the following doesn't work:
org 0x7C00
mov ax,0x0000
mov ds,ax
mov si, HelloWorld
HelloWorld db 'Hello World',13,10,0
What I'm expecting is that the mov si, HelloWorld instruction will place the value 0x7C08 in si (which is 0x7c00 + the offset of HelloWorld), ready for things like lodsb.
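The expected value can be checked by hand. The sketch below uses Python rather than assembly so the arithmetic can be run anywhere. It adds up the instruction lengths shown in the ndisasm listing further down in this question; the variable names are illustrative only.

```python
# Load address declared by "org 0x7C00".
ORG = 0x7C00

# Encoded length of each instruction, read off the ndisasm listing
# in the question (opcode bytes shown in the comments).
instruction_sizes = {
    "mov ax,0x0000": 3,      # B8 00 00
    "mov ds,ax": 2,          # 8E D8
    "mov si,HelloWorld": 3,  # BE lo hi
}

# HelloWorld is placed directly after the three instructions.
offset_of_hello = sum(instruction_sizes.values())
expected_si = ORG + offset_of_hello

print(hex(offset_of_hello), hex(expected_si))  # 0x8 0x7c08
```

So the label sits 8 bytes into the image, and 0x7C08 is the value the questioner expects to see in si.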
When I build this (using NASM) and run it (using Bochs), I find that the final instruction actually looks like this:
mov si, 0x8400
Why is this, and where has the value 0x8400 come from?
Update: I've discovered that placing HelloWorld in the .data section produces the expected output:
section .data
HelloWorld db 'Hello World',13,10,0
Why is this?
FYI, the command used to build this is nasm -f bin input.asm -o output.bin
Update 2: I've twigged that 0x8400 is 0x7c00 + 0x0800, where 8 is the offset of HelloWorld from the beginning of the output. I noticed this when I spotted that with org 0 the address used is 0x0800.
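The arithmetic in this update can be verified with a few lines of Python. The names below are made up for illustration. The fact that 0x0800 equals 8 shifted left by 8 bits is only a pattern visible in these two data points, not a diagnosis of what the assembler did.

```python
ORG = 0x7C00
OFFSET = 8  # offset of HelloWorld from the start of the output

# Values reported in the question for the operand of "mov si, HelloWorld".
observed_with_org_7c00 = 0x8400
observed_with_org_0 = 0x0800

# Subtracting the origin leaves the same value seen with "org 0".
remainder = observed_with_org_7c00 - ORG

# The remainder is not the offset itself, but the offset moved up one byte.
print(hex(remainder), hex(OFFSET << 8))  # 0x800 0x800
```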
I still don't understand what's going on, though. Spotting this has just made me more confused!
As requested, disassembly using ndisasm:
00000000 B80000 mov ax,0x0
00000003 8ED8 mov ds,ax
00000005 BE0084 mov si,0x8400
00000008 48 dec ax
00000009 656C gs insb
0000000B 6C insb
0000000C 6F outsw
0000000D 20576F and [bx+0x6f],dl
00000010 726C jc 0x7e
00000012 640D0A00 fs or ax,0xa
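The instructions listed from offset 8 onward are not real code. ndisasm cannot tell where data begins, so it decodes the string bytes as if they were instructions. The Python sketch below (names are illustrative) rebuilds the image from the hex column of the listing, reads the 16-bit operand of the mov si instruction, and decodes the remaining bytes as ASCII.

```python
import struct

# Hex bytes copied from the second column of the ndisasm listing.
image = bytes.fromhex(
    "B80000"    # mov ax,0x0
    "8ED8"      # mov ds,ax
    "BE0084"    # mov si,0x8400
    "48656C6C6F20576F726C640D0A00"  # bytes ndisasm decoded as instructions
)

# 0xBE is "mov si, imm16"; the operand follows in little-endian order.
opcode = image[5]
(imm16,) = struct.unpack_from("<H", image, 6)

# Everything from offset 8 onward is the HelloWorld data.
text = image[8:].decode("ascii")

print(hex(opcode), hex(imm16), repr(text))
# 0xbe 0x8400 'Hello World\r\n\x00'
```

This confirms that the string itself was assembled correctly at offset 8 and that only the operand of mov si differs from the expected 0x7C08.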
Comments (4)
Upgrade your copy of nasm.
Using nasm 2.09rc1 I get the following (unexpected) disassembly:
Using nasm 2.09.08 I get the following (expected) disassembly:
I guess it was a release candidate for a reason... :)
Unless you use bin format, nasm is allowed to move your data into a segment .data. This makes a lot of sense when compiling to a PE format such as .EXE.
In other words, are you certain that 0x8400 is not the proper address once the output binary has been laid out and linked? I understand you are trying to emit data in the segment .text; to do that, I think you need the bin directive.
Edit:
Given that you are using the bin format, and considering your additional information that building the HelloWorld string in segment .data does work, I suspect what you need to do is:
I may be off on the syntax (it's been years since I coded in 16-bit x86), but the point is that you're getting an offset based on an assumption about the value of ds, which you are explicitly clearing and which the assembler might assume has the value of segment .code or similar. (Thanks to Aaron for correcting my mov to an lea.)
From MASM help:
So you have a code segment (CS) and a data segment (DS), and they are not equal; label addresses therefore differ depending on the section.
On x86, section alignment is usually 4096 bytes, which matches the size of a memory page.
Hmm... 'H' is 0x48. Maybe you're pulling the first byte of 'Hello World' instead of the address of it.