Branch tables in assembler (NASM)
I have a question about branch tables.
There are two ways to declare such a table:
- in the data segment (DS)
- in the code segment (CS)
What is the difference between these methods?
I learned them from the following examples:
Case 1:
SECTION .data
i       dd 2
stab    dd m1, m2, m3       ; branch table for switch
SECTION .text
global start
start:
    mov ebx, [i]            ; switch (i)
    cmp ebx, 1
    jl  end
    cmp ebx, 3
    jg  end
    shl ebx, 2              ; each stab entry is 4 bytes
    jmp [stab+ebx-4]
m1:                         ; do something.....
....
Case 2:
SECTION .data
i       dd 2
SECTION .text
global start
start:
    mov ebx, [i]            ; switch (i)
    cmp ebx, 1
    jl  end
    cmp ebx, 3
    jg  end
    shl ebx, 2              ; each stab entry is 4 bytes
    jmp [cs:ebx+stab-4]     ; branch table in code segment
ALIGN 4
stab    dd m1, m2, m3
m1:                         ; do something
....
Our prof told us that method 2 is more efficient, but why? Is it because the branch table is only a short jump away and we don't need to point into DS?
greetz destiny
Comments (1)
Which method is more efficient depends on the processor you are dealing with. However, I beg to differ with your prof: using CS requires a segment override prefix, which makes the code bigger, and thus longer to process and less cache friendly. On x86 Windows (userland), CS and DS flatten out to the same linear address space, making it a moot optimization. Certain processors (Intel Atom) also have slower access to CS when the segment base is non-zero, though under x64 this falls away, as all segments apart from FS and GS are ignored (their base is implicitly 0) due to x64's flat addressing model. It should also be noted that Intel advises using as few segment registers as possible (this eases the burden on the register renamer).
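To make the size argument concrete, here is a minimal sketch of the question's Case 2 with both forms of the indirect jump side by side. It assumes 32-bit code in a flat memory model; the labels case1, case2, case3 and done are hypothetical stand-ins for the elided case bodies, and the byte counts in the comments are the standard encodings of `jmp r/m32` with a 32-bit displacement, with and without the CS override prefix (2E).

```nasm
; Sketch only: 32-bit flat model assumed; case1..case3 and done are
; hypothetical labels, not taken from the original question.
BITS 32

SECTION .data
i:      dd 2

SECTION .text
global start
start:
        mov     ebx, [i]            ; switch (i)
        cmp     ebx, 1
        jl      done                ; i < 1: no case matches
        cmp     ebx, 3
        jg      done                ; i > 3: no case matches
        shl     ebx, 2              ; each table entry is 4 bytes
        jmp     [stab+ebx-4]        ; default DS:  FF A3 disp32     (6 bytes)
        ; jmp   [cs:stab+ebx-4]     ; CS override: 2E FF A3 disp32  (7 bytes)

ALIGN 4
stab:   dd case1, case2, case3      ; table placed in .text, read through DS

case1:  jmp     done
case2:  jmp     done
case3:
done:   ret                         ; placeholder; a real program would exit via its OS
```

Assembling with something like `nasm -f elf32 table.asm -l table.lst` (the file names are placeholders) and swapping the two `jmp` lines should show the extra prefix byte in the listing. Note that in a flat model the table can live in `.text` and still be read through the default DS, so where the table is placed and whether the CS override is used are separate choices.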