Why can't Intel Pin instrument the open system call?

Posted on 2025-01-11 22:36:06

I am trying to build a pintool that should be able to instrument an open() syscall that targets a specific file/directory and replace the file path argument with another value.

For example, here is a very simple program that I want to instrument:

    #include <iostream>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    
    using namespace std;
    
    int main(int argc, char **argv)
    {
        int i = open("/home/preet_derasari/important.txt", O_RDONLY);
        cout << "fid: " << i << endl;
    }

In this example I want Pin to change the file path from /home/preet_derasari/important.txt to /home/preet_derasari/dummy.txt.
In order to do this I wrote a very simple pintool after referring to some example pintools and Pin APIs:

    #include "pin.H"
    #include <iostream>
    #include <fstream>
    #include <syscall.h>
    #include <string>
    using namespace std;
    
    INT32 Usage()
    {
        cout << "This tool prints out the number of dynamically executed " << endl
             << "instructions, basic blocks and threads in the application." << endl
             << endl;
    
        cout << KNOB_BASE::StringKnobSummary() << endl;
    
        return -1;
    }
    
    void SyscallEntry(THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, void *v)
    {
        ADDRINT sysNum = PIN_GetSyscallNumber(ctxt, std);
        cout << "entered syscall: " << sysNum << endl;
        if(sysNum == SYS_open)
        {
            cout << "open encountered!" << endl;
            char *path = (char *)PIN_GetSyscallArgument(ctxt, std, 0);
            cout << "Original File Path: " << path << endl;
            int match = strcmp((char *)PIN_GetSyscallArgument(ctxt, std, 0), "/home/preet_derasari/important.txt");
            if(!match)
            {
                string pathDummy = "/home/preet_derasari/dummy.txt";
                PIN_SetSyscallArgument (ctxt, std, 0, (ADDRINT) pathDummy.c_str());
                cout << "Dummy File Path: " << pathDummy << endl;
            }
        }
    }
    
    int main(int argc, char* argv[])
    {
        cout << "Open Syscall Value: " << SYS_open << endl;
    
        if (PIN_Init(argc, argv))
        {
            return Usage();
        }
    
        cout << "===============================================" << endl;
        cout << "This application is instrumented by MyPinTool" << endl;
        cout << "===============================================" << endl;
    
        PIN_AddSyscallEntryFunction(SyscallEntry, 0);
    
        // Start the program, never returns
        PIN_StartProgram();
    
        return 0;
    }

I run the pintool with this command: ../../../pin -t obj-intel64/MY_pin.so -- test where MY_pin.so is the pintool shared object library and test is the sample code mentioned above.

The output just baffles me because Pin is instrumenting all syscalls except open:

    Open Syscall Value: 2
    ===============================================
    This application is instrumented by MyPinTool
    ===============================================
    entered syscall: 12
    entered syscall: 158
    entered syscall: 21
    entered syscall: 257
    entered syscall: 5
    entered syscall: 9
    entered syscall: 3
    entered syscall: 257
    entered syscall: 0
    entered syscall: 17
    entered syscall: 17
    entered syscall: 17
    entered syscall: 5
    entered syscall: 9
    entered syscall: 17
    entered syscall: 17
    entered syscall: 17
    entered syscall: 9
    entered syscall: 9
    entered syscall: 9
    entered syscall: 9
    entered syscall: 9
    entered syscall: 3
    entered syscall: 158
    entered syscall: 10
    entered syscall: 10
    entered syscall: 10
    entered syscall: 11
    entered syscall: 12
    entered syscall: 12
    entered syscall: 257
    entered syscall: 5
    entered syscall: 9
    entered syscall: 3
    entered syscall: 3

As you can see, Pin instruments all syscalls except open, i.e., syscall number 2 (on the x86_64 ISA).

An interesting observation is that the program doesn't output the cout from my test program (cout << "fid: " << i << endl;) which makes me question if Pin is doing something weird with the open syscall?

Specifications:

  • Pin version - pin-3.21-98484-e7cd811fd
  • gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
  • ISA: x86_64
  • CPU: AMD Ryzen 7 1700X Eight-Core Processor

Can someone please help me understand why this is happening?

扛刀软妹 2025-01-18 22:36:08

strace cat foo shows you that programs don't use the old open(2) system call anymore:

    ...
    openat(AT_FDCWD, "foo", O_RDONLY)       = 3
    ...

__NR_openat is 257, which your PIN tool observed 3 times. Apparently even the open() libc wrapper function internally uses the openat Linux system call. (The __NR_open = 2 system call does still work; the kernel also has code to pass its args to the current implementation. IDK which is more efficient, like maybe it just sets up an AT_FDCWD arg and calls sys_openat() which has to decode it again, just like glibc does in user-space.)
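Since the path can arrive through either syscall, a tool that wants to rewrite it probably needs to handle both numbers, and the pathname sits at a different argument index in each. Below is a minimal sketch of that bookkeeping, kept free of Pin headers so it compiles on its own. The names PathArgIndex and RedirectPath are hypothetical helpers, not part of Pin. The sketch also keeps the replacement string in static storage: as far as I can tell, the tool in the question passes pathDummy.c_str() from a local std::string, which is destroyed when SyscallEntry returns, i.e. before the kernel reads the argument.

```cpp
#include <cstring>
#include <sys/syscall.h>   // SYS_open (where available), SYS_openat

// Hypothetical helper: index of the pathname argument for the open-like
// syscalls, or -1 for any other syscall.
int PathArgIndex(long sysNum)
{
#ifdef SYS_open                          // not defined on every architecture
    if (sysNum == SYS_open) return 0;    // open(pathname, flags, mode)
#endif
    if (sysNum == SYS_openat) return 1;  // openat(dirfd, pathname, flags, mode)
    return -1;
}

// Both strings have static storage duration, so a pointer to them stays
// valid after the instrumentation callback has returned.
static const char kOriginalPath[] = "/home/preet_derasari/important.txt";
static const char kDummyPath[]    = "/home/preet_derasari/dummy.txt";

// Hypothetical helper: returns the path the syscall should see. Only an
// exact match of kOriginalPath is redirected; everything else passes through.
const char* RedirectPath(const char* path)
{
    return std::strcmp(path, kOriginalPath) == 0 ? kDummyPath : path;
}
```

Inside SyscallEntry, the idea would be to compute the index from PIN_GetSyscallNumber(ctxt, std), read the path with PIN_GetSyscallArgument(ctxt, std, index) when the index is not -1, and write the result of RedirectPath back with PIN_SetSyscallArgument at the same index. Note that the exact-match comparison only catches the absolute path; a relative path passed to openat would need to be resolved first.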


The open(2) man page also documents openat(2).

The dirfd argument is used in conjunction with the pathname
argument as follows:

  • If the pathname given in pathname is absolute, then dirfd is
    ignored.

  • If the pathname given in pathname is relative and dirfd is the
    special value AT_FDCWD, then pathname is interpreted relative
    to the current working directory of the calling process (like
    open()).

  • ...

openat / linkat and so on, when used with an fd from open(O_DIRECTORY), let programs like find avoid TOCTOU races, and/or let multi-threaded programs avoid having to actually chdir (because there's only one CWD per process, not per thread.)

Using them with AT_FDCWD has no advantage or disadvantage vs. old-style open(2).
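The three dirfd cases described above can be tried out with a few small wrappers. This is only an illustrative sketch for Linux; the function names are made up for the example, and the -1 passed as dirfd in the second function is just a deliberately invalid descriptor to show that it is ignored for absolute paths.

```cpp
#include <fcntl.h>    // open, openat, AT_FDCWD, O_RDONLY, O_DIRECTORY
#include <unistd.h>   // close

// Behaves like open(path, O_RDONLY): a relative path is resolved against
// the current working directory of the process.
int OpenViaAtFdcwd(const char* path)
{
    return openat(AT_FDCWD, path, O_RDONLY);
}

// For an absolute path, dirfd is ignored, so even an invalid descriptor
// such as -1 does not make the call fail.
int OpenAbsoluteIgnoringDirfd(const char* absolutePath)
{
    return openat(-1, absolutePath, O_RDONLY);
}

// Opens `name` relative to the directory `dir` without calling chdir().
// Returns the file descriptor, or -1 if either step fails.
int OpenInDirectory(const char* dir, const char* name)
{
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);  // fails if dir is not a directory
    if (dirfd < 0) return -1;
    int fd = openat(dirfd, name, O_RDONLY);
    close(dirfd);                                   // fd remains valid after this
    return fd;
}
```

For example, OpenInDirectory("/dev", "null") and OpenViaAtFdcwd("/dev/null") should both return a usable descriptor for the same file. The first form is the one that helps with TOCTOU races, because later lookups go through the already-opened directory rather than re-resolving its path.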
