Format string vulnerability - printf

Posted on 2024-11-01 16:40:09

Why does this print the value stored at memory address 0x08480110? I'm not sure why there are five %08x specifiers: how far up the stack do they take you?

address = 0x08480110
address (encoded as 32 bit le string): "\x10\x01\x48\x08"
printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");

This example is taken from page 11 of this paper http://crypto.stanford.edu/cs155/papers/formatstring-1.2.pdf

Comments (5)

陌伤ぢ 2024-11-08 16:40:09

I think that the paper provides its printf() examples in a somewhat confusing way because the examples use string literals for format strings, and those don't generally permit the type of vulnerability being described. The format string vulnerability as described here depends on the format string being provided by user input.

So the example:

printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");

Might better be presented as:

/* 
 * in a real program, some user input source would be copied 
 * into the `outstring` buffer 
 */
char outstring[80] = "\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|";

printf(outstring);

Since the outstring array is an automatic variable, the compiler will likely put it on the stack. After the user input is copied into the outstring array, it will look like the following as 'words' on the stack (assuming little endian):

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the "\x10\x01\x48\x08"

The compiler will put other items on the stack as it sees fit (other local variables, saved registers, whatever).

When the printf() call is about to be made, the stack might look like:

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the "\x10\x01\x48\x08"
var1
var2
var3
saved ECX
saved EDI

Note that I'm completely making those entries up. Each compiler will use the stack in different ways, so a format string exploit has to be custom crafted for one exact scenario. In other words, you won't always use 5 dummy format specifiers like in this example; as the attacker you'd need to figure out how many dummies the particular vulnerability needs.
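
Since the number of dummies varies, it can help to see the string as something built from two parameters: the target address and the dummy count. This is only a sketch under the same assumptions as the paper's example (32-bit little-endian target, address bytes containing no zero byte); `build_payload` is a name I made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes <addr as 4 LE bytes> '_' then `pops` copies of
 * "%08x" separated by '.', then "|%s|". Returns the string length, or 0 if
 * the result would not fit in out[cap] or pops is negative. */
size_t build_payload(uint32_t addr, int pops, char *out, size_t cap)
{
    assert(out != NULL);
    if (pops < 0)
        return 0;
    size_t n = (size_t)pops;
    size_t len = 4 + 1 + 4 * n + (n > 0 ? n - 1 : 0) + 4;
    if (len + 1 > cap)
        return 0;

    char *p = out;
    for (int i = 0; i < 4; i++)              /* little-endian address bytes */
        *p++ = (char)((addr >> (8 * i)) & 0xffu);
    *p++ = '_';
    for (size_t i = 0; i < n; i++) {         /* the dummy specifiers */
        if (i > 0)
            *p++ = '.';
        memcpy(p, "%08x", 4);
        p += 4;
    }
    memcpy(p, "|%s|", 5);                    /* 4 characters plus the NUL */
    return len;
}
```

With addr = 0x08480110 and pops = 5 this reproduces the 33-character string from the paper; a different stack layout would only change the pops argument.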

Now to call printf(), the argument (the address of outstring) is pushed on to the stack and printf() is called, so the argument area of the stack looks like:

outstring[0c]               // etc...
outstring[08] 0x30252e78    // from "x.%0"
outstring[04] 0x3830255f    // from "_%08"
outstring[00] 0x08480110    // from the "\x10\x01\x48\x08"
var1
var2
var3
saved ECX
saved EDI
&outstring   // the one real argument to `printf()`

However, printf() doesn't really know how many arguments have been placed on the stack for it; it goes by the format specifiers it finds in the format string (the one argument it's 'sure' to get). So printf() gets the format string argument and starts processing it. The first "%08x" prints the word that corresponds to 'saved EDI' in my example, the next "%08x" prints 'saved ECX', and so on. The "%08x" format specifiers just eat up data on the stack until printf() gets back to the string the attacker was able to input. Determining how many of them are needed is something an attacker would do by trial and error (probably with a test run that has a whole slew of "%08x" formats until they can 'see' where the format string starts).
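
To make the "it goes by the format specifiers" point concrete, here is a toy formatter, written as a sketch and not taken from any real C library. The name `mini_format` is invented and it only understands "%x" and "%%". Each "%x" pulls the next argument with va_arg, and nothing lets the callee check that the caller actually passed one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy formatter: handles only "%x" and "%%". */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    assert(out != NULL && fmt != NULL && cap > 0);
    va_list ap;
    size_t used = 0;

    va_start(ap, fmt);
    while (*fmt != '\0' && used + 1 < cap) {
        if (fmt[0] == '%' && fmt[1] == 'x') {
            /* Trusts the format string: fetches an argument whether or
             * not the caller supplied one. */
            unsigned value = va_arg(ap, unsigned);
            int n = snprintf(out + used, cap - used, "%x", value);
            if (n < 0 || (size_t)n >= cap - used)
                break;                       /* would not fit: stop here */
            used += (size_t)n;
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] == '%') {
            out[used++] = '%';
            fmt += 2;
        } else {
            out[used++] = *fmt++;
        }
    }
    va_end(ap);
    out[used] = '\0';
    return used;
}
```

Called correctly, mini_format(buf, sizeof buf, "%x.%x|%%", 0x10u, 0xffu) yields "10.ff|%". Called with more "%x" than arguments, it would read whatever sits in the next argument slot, which is exactly the behaviour the attack relies on (and undefined behaviour as far as the C standard is concerned).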

Anyway, when printf() gets to processing the "%s" format specifier, it has consumed all the stack entries up to where the outstring buffer resides. The "%s" specifier treats its stack entry as a pointer, and the string that the user has put into that buffer has been carefully crafted to have a binary representation of 0x08480110, so printf() will print out whatever is at that address as an ASCIIZ string.
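
The fix for the underlying bug is to never let user input act as the format string. A minimal sketch, using snprintf() so the result can be inspected; the function name `format_untrusted` is mine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: treats untrusted text purely as data.
 * The vulnerable form would be: snprintf(out, cap, user_input); */
int format_untrusted(char *out, size_t cap, const char *user_input)
{
    assert(out != NULL && user_input != NULL);
    return snprintf(out, cap, "%s", user_input);   /* fixed literal format */
}
```

With the "%s" literal in place, an input such as "%08x|%s|%n" is copied through character for character instead of being interpreted. GCC and Clang can also warn about the vulnerable pattern with -Wformat-security.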

我不吻晚风 2024-11-08 16:40:09

You have 6 format specifiers (5 lots of %08x and one of %s), but you do not provide values for those format specifiers. You immediately fall into the realm of undefined behaviour - anything could happen and there is no wrong answer.

However, in the normal course of events, the values passed to printf() would have been stored on the stack, so the code in printf() reads values off the stack as if the extra values had been passed. The function return address is on the stack, too. As far as I can see, there is no guarantee that the value 0x08480110 will actually be produced. This sort of attack depends very much on the specific program and the faulty function call, and you might well get a very different value. The example code was most likely written assuming a 32-bit Intel (little-endian) CPU rather than a 64-bit or big-endian CPU.
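
If you want to sanity-check how many arguments a format string expects, a rough count is easy to write. This is a simplified sketch: the name `count_conversions` is invented, and it ignores '*' widths and precisions, which consume arguments of their own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts '%' conversions that consume an argument.
 * "%%" prints a literal percent sign and consumes nothing. */
size_t count_conversions(const char *fmt)
{
    assert(fmt != NULL);
    size_t count = 0;
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%')
            continue;
        if (fmt[i + 1] == '%') {   /* escaped percent sign */
            i++;
            continue;
        }
        if (fmt[i + 1] == '\0')    /* stray '%' at the very end */
            break;
        count++;
    }
    return count;
}
```

For the string in the question it returns 6, while the call passes zero arguments after the format.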


Adapting the code fragment, compiling it into a complete program, ignoring the compilation warnings, using a 32-bit compilation on MacOS X 10.6.7 with GCC 4.2.1 (XCode 3), the following code:

#include <stdio.h>

static void somefunc(void)
{
    printf("AAAAAAAAAAAAAAAA.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.|%s|\n");
}

int main(void)
{
    char buffer[160] =
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz01234";
    somefunc();
    return 0;
}

produces the following result:

 AAAAAAAAAAAAAAAA.0x000000A0.0xBFFFF11C.0x00001EC4.0x00000000.0x00001E22.0xBFFFF1C8.0x00001E5A.|abcdefghijklmnopqrstuvwxyz012345abcdefghijklmnopqrstuvwxyz012345abcdefghijklmnopqrstuvwxyz012345abcdefghijklmnopqrstuvwxyz012345abcdefghijklmnopqrstuvwxyz01234|

As you can see, I eventually 'found' the string in the main program from the printf() statement. When I compiled it in 64-bit mode, I got a core dump instead. Both results are perfectly correct; the program invokes undefined behaviour, so anything the program does is valid. If you're curious, search for 'nasal demons' for more information on undefined behaviour.

And get used to experimenting with these sorts of issues.


Another variation

#include <stdio.h>

static void somefunc(void)
{
    char format[] =
        "AAAAAAAAAAAAAAAA.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X\n"
        ".0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X\n"
        ".0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X.0x%08X\n";
    printf(format, 1);
}

int main(void)
{
    char buffer[160] =
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz012345"
        "abcdefghijklmnopqrstuvwxyz01234";
    somefunc();
    return 0;
}

This produces:

AAAAAAAAAAAAAAAA.0x00000001.0x00000099.0x8FE467B4.0x41000024.0x41414141
.0x41414141.0x41414141.0x2E414141.0x30257830.0x302E5838.0x38302578.0x78302E58
.0x58383025.0x2578302E.0x2E583830.0x30257830.0x2E0A5838.0x30257830.0x302E5838

You might recognize the format string in the hex output - 0x41 is capital A, for example.

The 64-bit output from that code is both similar and different:

AAAAAAAAAAAAAAAA.0x00000001.0x00000000.0x00000000.0xFFE0082C.0x00000000
.0x41414141.0x41414141.0x2578302E.0x30257830.0x38302578.0x58383025.0x0A583830
.0x2E583830.0x302E5838.0x78302E58.0x2578302E.0x30257830.0x38302578.0x38302578

悸初 2024-11-08 16:40:09

You misunderstood the paper.

The text you linked is assuming that the current position on the stack is 0x08480110 (look at the surrounding text). The printf() will dump data from wherever on the stack you happen to be.

The \x10\x01\x48\x08 at the beginning of the format string is merely to print the (assumed) address to stdout in front of the dumped data. In no way do these numbers modify the address from which the data is dumped.

何以笙箫默 2024-11-08 16:40:09

You're correct about "take you up the stack", but only barely; it relies on the assumption that arguments are passed on the stack rather than in registers. (For a variadic function on 32-bit x86 that is probably a safe assumption, but it is still an assumption about implementation details.)

Each %08x asks for the 'next unsigned int argument' to be printed in hex; what actually occurs in that 'next argument' location is both architecture and compiler dependent. If you compare the values you get with /proc/self/maps for the process, you might be able to narrow down what some of the numbers mean.
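
For the /proc/self/maps comparison (Linux only), each line of that file starts with a "start-end" pair of hex addresses. A minimal sketch that tests one leaked value against one such line; `maps_line_contains` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if addr lies in the [start, end) range
 * at the beginning of a /proc/<pid>/maps line, 0 otherwise. */
int maps_line_contains(const char *line, unsigned long long addr)
{
    assert(line != NULL);
    unsigned long long start, end;
    if (sscanf(line, "%llx-%llx", &start, &end) != 2)
        return 0;                 /* not a maps-style line */
    return addr >= start && addr < end;
}
```

You would read the file line by line with fgets() and call this for each value printf() leaked; the rest of the matching line (permissions, path) then tells you whether the value points into the stack, a library, or the program image.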

紫瑟鸿黎 2024-11-08 16:40:09

A little theory

If you want to see the actual trick for writing to a custom address, jump to the second part.

Let's try tweaking the format string in the printf() trick.

printf("ABABABAB");

But encoding a hex address directly into the format string did not work. The whole point is to smuggle onto the stack an address that the attack will then use, but my format string "ABABABAB" ended up in the .rodata section and not on the stack as we wanted.

Breakpoint 1, __printf (format=0x555555556004 "ABABABAB") at ./stdio-common/printf.c:28
(gdb) i args
format = 0x555555556004 "ABABABAB"

Looking this address up in the process memory map suggests it is in the .rodata section:

      Start Addr           End Addr       Size     Offset  Perms  objfile
  0x555555554000     0x555555555000     0x1000        0x0  r--p   /home/drazen/proba/main
  0x555555555000     0x555555556000     0x1000     0x1000  r-xp   /home/drazen/proba/main
  0x555555556000     0x555555557000     0x1000     0x2000  r--p   /home/drazen/proba/main
  0x555555557000     0x555555558000     0x1000     0x2000  r--p   /home/drazen/proba/main
  0x555555558000     0x555555559000     0x1000     0x3000  rw-p   /home/drazen/proba/main

Checking with readelf confirms it:

drazen@HP-ProBook-640G1:~/proba$ readelf  -p .rodata  main 
String dump of section '.rodata':
  [     4]  ABABABAB

So far so good, but the weird part came when I dumped the stack and expected to find the address of the ABABABAB string in the stack frame, as the argument passed to printf().

(gdb) i frame
Stack level 0, frame at 0x7fffffffddf0:
rip = 0x7ffff7de16f0 in __printf (./stdio-common/printf.c:28); saved rip = 0x555555555165
called by frame at 0x7fffffffde00
source language c.
Arglist at 0x7fffffffdde0, args: format=0x555555556004 "ABABABAB"

You can see the return address into main(), 0x555555555165, and you would expect to find the format string address on the stack at 0x7fffffffdde0.
But when we dump the stack, instead of the format string address there are just 8 bytes of zeros where the function argument should be, between the return address of the __libc_start_call_main() stack frame and the return address of the printf() stack frame:

(gdb) x/32gx $sp
0x7fffffffdde0: 0x0000000000000000  0x0000555555555165
0x7fffffffddf0: 0x0000000000000001  0x00007ffff7daad90
0x7fffffffde00: 0x0000000000000000  0x0000555555555149
0x7fffffffde10: 0x0000000100000000  0x00007fffffffdf08

So how is the address of the format string passed to printf()?
When we dumped the registers, we saw the format string address in the rsi register.

(gdb) i r
rax            0x7ffff7f9b868      140737353726056
rbx            0x0                 0
rcx            0x0                 0
rdx            0x7fffffffdcf0      140737488346352
rsi            0x555555556004      93824992239624
rdi            0x7ffff7f9b780      140737353725824

Because function arguments (the string address in this case) are passed in registers such as rsi and rdi for speed, and not on the stack, we can't use the format string and string arguments for this trick.

So we can just use strings created as local (automatic) variables, which are placed on the stack before the return address in the current stack frame.


Actual example

Anyway, I tried this small example and it worked: it printed out the addresses put in local strings (created on the stack). So we could use this trick to make local strings mimic the addresses we want to access:

Sample code

We have to print 5 unrelated values before we reach what we want: our local strings!

Using the hexadecimal format %x showed the hex representation of the strings avro, nana, loli on the stack (using the %s string format would cause a segmentation fault, because printf() would interpret those values as addresses of strings, and those "addresses" are probably not in a mapped area of the process or are in a protected memory area):

Output

So now we have used local variables on the stack to masquerade as addresses for data access.
But what if we use this to try to write to that address?

Let's change the last %X format specifier to %n.
Instead of printing the data on the stack with %X, printf() will treat that data as the address of a variable in which it stores the number of characters already printed.
So the idea is to gain write access to a custom address.
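
For reference, this is what %n does when it is used legitimately, with a real int pointer supplied as the argument. It is a minimal sketch that assumes a C library which honours %n for a literal format string (some libraries restrict or disable it); `chars_before_n` is a made-up name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns how many characters had been produced
 * at the point where %n appears in the format string. */
int chars_before_n(void)
{
    char buf[32];
    int count = -1;
    int total = snprintf(buf, sizeof buf, "ABABABAB%n\n", &count);
    assert(total == 9);   /* 8 letters plus the newline */
    return count;         /* %n stored the count through the pointer */
}
```

Here the pointer is one we chose, so the write lands in `count`. In the attack there is no such argument, and printf() takes the "pointer" from whatever data sits in the next argument slot.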

printf("ABABABAB\n,%016llX\n,%016llX\n,%016llX\n,%016llX\n,%016llX\n,%016llX\n,%016llX\n,%n");

Our fake address 0x61616161616161, which is the ASCII string "aaaaaaa", ends up in the %rax register, and printf() will write the number of characters already printed (stored in r12) to this address:

(gdb) i r
rax            0x61616161616161    27410143614427489
rbx            0x555555556052      93824992239698


      0x00007ffff7df7c3c <+7180>:   jne    0x7ffff7df8276 <__vfprintf_internal+8774>
   => 0x00007ffff7df7c42 <+7186>:   mov    %r12d,(%rax)

But in our case this causes a segmentation fault (SIGSEGV), since address 0x61616161616161 is not mapped into the process memory.

Continuing.
ABABABAB
,00007FFFFFFFDF08
,00007FFFFFFFDF18
,0000555555557DB8
,00007FFFF7F9BF10
,00007FFFF7FC9040
,0031313131313131
,0032323232323232
Program received signal SIGSEGV, Segmentation fault.

I hope this helps!
