Writing a Linux int 80h system-call wrapper in GNU C inline assembly
I'm trying to use inline assembly.
I read this page http://www.codeproject.com/KB/cpp/edujini_inline_asm.aspx but I can't understand how the parameters are passed to my function.
I'm writing a C write example. This is my function header:
write2(char *str, int len){
}
And this is my assembly code:
global write2
write2:
push ebp
mov ebp, esp
mov eax, 4 ;sys_write
mov ebx, 1 ;stdout
mov ecx, [ebp+8] ;string pointer
mov edx, [ebp+12] ;string size
int 0x80 ;syscall
leave
ret
What do I have to do to pass that code to the C function? I'm doing something like this:
write2(char *str, int len){
asm ( "movl 4, %%eax;"
"movl 1, %%ebx;"
"mov %1, %%ecx;"
//"mov %2, %%edx;"
"int 0x80;"
:
: "a" (str), "b" (len)
);
}
That's because I don't have an output variable, so how do I handle that?
Also, with this code:
global main
main:
mov ebx, 5866 ;PID
mov ecx, 9 ;SIGKILL
mov eax, 37 ;sys_kill
int 0x80 ;interruption
ret
How can I put that code inline in my C code, so that I can ask the user for the pid? This is my attempt so far:
void killp(int pid){
asm ( "mov %1, %%ebx;"
"mov 9, %%ecx;"
"mov 37, %%eax;"
:
: "a" (pid) /* optional */
);
}
Comments (1)
Well, you don't say specifically, but from your post it appears that you're using gcc and its inline asm with constraints syntax (other C compilers have very different inline syntax). That said, you probably need to use AT&T assembler syntax rather than Intel, as that's what gcc uses by default.
So with the above said, let's look at your write2 function. First, you don't want to create a stack frame, as gcc will create one, so if you create one in the asm code, you'll end up with two frames, and things will probably get very confused. Second, since gcc is laying out the stack frame, you can't access vars with "[ebp + offset]" as you don't know how it's being laid out.
That's what the constraints are for: you say what kind of place you want gcc to put the value (any register, memory, specific register) and then use "%X" in the asm code, where X is the operand's number (counting from 0). Finally, if you use explicit registers in the asm code, you need to list them in the 3rd section (after the input constraints) so gcc knows you are using them. Otherwise it might put some important value in one of those registers, and you'd clobber that value.
You also need to tell the compiler that the inline asm will or might read from or write to memory pointed to by the input operands; a pointer operand alone does not imply that.
So with all that, your write2 function looks like:
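Here is a minimal sketch of such a function, assuming gcc on 32-bit x86 Linux. It is an illustration of the points above, not a definitive implementation. The #else branch is my own addition so the sketch still builds on other targets, where int 0x80 is not the native system-call ABI.

```c
#include <unistd.h>

/* Sketch: explicit movs in AT&T syntax, with every register the asm
 * touches listed as a clobber, plus "memory" because the kernel reads
 * the buffer that str points to. */
void write2(const char *str, int len) {
#if defined(__i386__)
    __asm__ volatile (
        "movl $4, %%eax\n\t"   /* sys_write; immediates need a $ prefix */
        "movl $1, %%ebx\n\t"   /* file descriptor 1 = stdout */
        "movl %0, %%ecx\n\t"   /* operand 0: buffer pointer */
        "movl %1, %%edx\n\t"   /* operand 1: byte count */
        "int $0x80"
        : /* no outputs */
        : "g" (str), "g" (len)
        : "eax", "ebx", "ecx", "edx", "memory");
#else
    /* Assumption: not i386, so fall back to the libc wrapper. */
    if (write(1, str, (size_t)len) < 0) { /* error ignored in this sketch */ }
#endif
}
```

Calling write2("hello\n", 6) should then print the string on stdout.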
Note the AT&T syntax: src, dest rather than dest, src, and % before the register name.
Now this will work, but it's inefficient as it will contain lots of extra movs. In general, you should NEVER use mov instructions or explicit registers in asm code, as you're much better off using constraints to say where you want things and letting the compiler ensure that they're there. That way, the optimizer can probably get rid of most of the movs, particularly if it inlines the function (which it will do if you specify -O3). Conveniently, the i386 machine model has constraints for specific registers, so you can instead do:
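A sketch of that variant, assuming gcc on 32-bit x86 Linux: the "c" and "d" constraints ask the compiler to place the pointer in ECX and the length in EDX before the asm runs, so only two movs remain. The #else fallback is my own addition so the sketch builds on other targets.

```c
#include <unistd.h>

/* Sketch: the compiler loads ECX and EDX for us via constraints. */
void write2(const char *str, int len) {
#if defined(__i386__)
    __asm__ volatile (
        "movl $4, %%eax\n\t"   /* sys_write */
        "movl $1, %%ebx\n\t"   /* file descriptor 1 = stdout */
        "int $0x80"
        : /* no outputs */
        : "c" (str),           /* "c": put str in ECX */
          "d" (len)            /* "d": put len in EDX */
        : "eax", "ebx", "memory");
#else
    /* Assumption: not i386, so fall back to the libc wrapper. */
    if (write(1, str, (size_t)len) < 0) { /* error ignored in this sketch */ }
#endif
}
```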
or even better:
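A sketch with no movs at all, assuming gcc on 32-bit x86 Linux: every register is loaded through a constraint and the asm body is just the int instruction. One caveat I've added here: the kernel overwrites EAX with the return value, so the sketch declares EAX as a read-write operand ("+a") rather than input-only, even though the result is thrown away. The #else fallback is also my addition.

```c
#include <unistd.h>

/* Sketch: all four registers are set up by constraints. */
void write2(const char *str, int len) {
#if defined(__i386__)
    int eax = 4;               /* sys_write on entry, return value on exit */
    __asm__ volatile ("int $0x80"
        : "+a" (eax)           /* EAX is both read and overwritten */
        : "b" (1),             /* EBX = file descriptor 1 (stdout) */
          "c" (str),           /* ECX = buffer pointer */
          "d" (len)            /* EDX = byte count */
        : "memory");
    (void)eax;                 /* result deliberately discarded */
#else
    /* Assumption: not i386, so fall back to the libc wrapper. */
    if (write(1, str, (size_t)len) < 0) { /* error ignored in this sketch */ }
#endif
}
```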
Note also the use of volatile, which is needed to tell the compiler that this can't be eliminated as dead even though its outputs (of which there are none) are not used. (asm with no output operands is already implicitly volatile, but making it explicit doesn't hurt when the real purpose isn't to calculate something; it's for a side effect like a system call.)
One final note -- this function is doing a write system call, which does return a value in eax -- either the number of bytes written or an error code. So you can get that with an output constraint:
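A sketch of that version, assuming gcc on 32-bit x86 Linux: the "=a" output operand captures whatever the kernel leaves in EAX. The #else fallback is my own addition so the sketch builds on other targets; it converts the libc convention (-1 plus errno) into the raw kernel convention (negative errno) so both branches behave alike.

```c
#include <errno.h>
#include <unistd.h>

/* Sketch: returns the number of bytes written, or a negative errno code. */
int write2(const char *str, int len) {
#if defined(__i386__)
    int ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)           /* EAX on exit: the system call's result */
        : "a" (4),             /* EAX on entry: sys_write */
          "b" (1),             /* EBX = file descriptor 1 (stdout) */
          "c" (str),           /* ECX = buffer pointer */
          "d" (len)            /* EDX = byte count */
        : "memory");
    return ret;
#else
    /* Assumption: not i386, so fall back to the libc wrapper. */
    ssize_t n = write(1, str, (size_t)len);
    return n < 0 ? -errno : (int)n;
#endif
}
```

A caller could then check the result, for example by treating any value below zero from write2("hello\n", 6) as a failure.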
All system calls return their result in EAX. Values from -4095 to -1 (inclusive) are negative errno codes; other values are non-errors. (This applies globally to all Linux system calls.)
If you're writing a generic system-call wrapper, you probably need a "memory" clobber because different system calls have different pointer operands, which might be inputs or outputs. See https://godbolt.org/z/GOXBue for an example that breaks if you leave it out, and this answer for more details about dummy memory inputs/outputs.
With this output operand, you need the explicit volatile: exactly one write system call should happen each time the asm statement "runs" in the source. Otherwise the compiler is allowed to assume that the statement exists only to compute its return value, and can eliminate repeated calls with the same input instead of writing multiple lines. (Or remove it entirely if you didn't check the return value.)
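The same pattern should carry over to the kill example from the question. Below is a minimal sketch, assuming gcc on 32-bit x86 Linux. The names kill2 and is_syscall_error are hypothetical helpers of mine, not an existing API, and the #else fallback is only there so the sketch builds on other targets.

```c
#define _POSIX_C_SOURCE 200809L  /* so <signal.h> declares kill() */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: raw Linux system calls report failure as a
 * negative errno code in the range -4095..-1. */
static int is_syscall_error(long ret) {
    return ret >= -4095 && ret <= -1;
}

/* Hypothetical wrapper: returns 0 on success or a negative errno code. */
static long kill2(int pid, int sig) {
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"
        : "=a" (ret)           /* EAX on exit: result */
        : "a" (37),            /* EAX on entry: sys_kill on i386 */
          "b" (pid),           /* EBX = target pid */
          "c" (sig)            /* ECX = signal number */
        : "memory");
    return ret;
#else
    /* Assumption: not i386, so fall back to the libc wrapper. */
    return kill((pid_t)pid, sig) == 0 ? 0 : -errno;
#endif
}
```

The killp(pid) from the question would then be kill2(pid, 9), with pid read from the user at run time, and is_syscall_error on the result tells you whether the signal was actually sent.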