Why can't Intel Pin instrument the open system call?
I am trying to build a pintool that should be able to instrument an open() syscall that targets a specific file/directory and replace the file path argument with another value.
For example, here is a very simple program that I want to instrument:
#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
using namespace std;

int main(int argc, char **argv)
{
    int i = open("/home/preet_derasari/important.txt", O_RDONLY);
    cout << "fid: " << i << endl;
}
In this example I want Pin to change the file path from /home/preet_derasari/important.txt to /home/preet_derasari/dummy.txt.
In order to do this I wrote a very simple pintool after referring to some example pintools and Pin APIs:
#include "pin.H"
#include <iostream>
#include <fstream>
#include <syscall.h>
#include <string>
#include <cstring>   // strcmp
using namespace std;

// Prints a short usage message when PIN_Init fails.
INT32 Usage()
{
    cout << "This tool redirects open() syscalls on one file to a dummy file." << endl
         << endl;
    cout << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

// Called by Pin before every system call made by the application.
void SyscallEntry(THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, void *v)
{
    ADDRINT sysNum = PIN_GetSyscallNumber(ctxt, std);
    cout << "entered syscall: " << sysNum << endl;
    if (sysNum == SYS_open)
    {
        cout << "open encountered!" << endl;
        // For open(2), argument 0 is the pathname.
        char *path = (char *)PIN_GetSyscallArgument(ctxt, std, 0);
        cout << "Original File Path: " << path << endl;
        int match = strcmp(path, "/home/preet_derasari/important.txt");
        if (!match)
        {
            // Static storage: the pointer must stay valid after this callback
            // returns, because the kernel reads the path only when the syscall
            // actually executes.
            static const char pathDummy[] = "/home/preet_derasari/dummy.txt";
            PIN_SetSyscallArgument(ctxt, std, 0, (ADDRINT)pathDummy);
            cout << "Dummy File Path: " << pathDummy << endl;
        }
    }
}

int main(int argc, char *argv[])
{
    cout << "Open Syscall Value: " << SYS_open << endl;
    if (PIN_Init(argc, argv))
    {
        return Usage();
    }
    cout << "===============================================" << endl;
    cout << "This application is instrumented by MyPinTool" << endl;
    cout << "===============================================" << endl;
    PIN_AddSyscallEntryFunction(SyscallEntry, 0);
    // Start the program, never returns
    PIN_StartProgram();
    return 0;
}
I run the pintool with this command: ../../../pin -t obj-intel64/MY_pin.so -- test
where MY_pin.so is the pintool shared object library and test is the sample code mentioned above.
The output just baffles me because Pin is instrumenting all syscalls except open:
Open Syscall Value: 2
===============================================
This application is instrumented by MyPinTool
===============================================
entered syscall: 12
entered syscall: 158
entered syscall: 21
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 257
entered syscall: 0
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 5
entered syscall: 9
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 3
entered syscall: 158
entered syscall: 10
entered syscall: 10
entered syscall: 10
entered syscall: 11
entered syscall: 12
entered syscall: 12
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 3
As you can see, Pin instruments all syscalls except open, i.e., syscall number 2 (on the x86_64 ISA).
An interesting observation is that the output of the cout statement in my test program (cout << "fid: " << i << endl;) never appears, which makes me wonder whether Pin is doing something weird with the open syscall.
Specifications:
- Pin version - pin-3.21-98484-e7cd811fd
- gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
- ISA: x86_64
- CPU: AMD Ryzen 7 1700X Eight-Core Processor
Can someone please help me understand why this is happening?
Comments (1)
strace cat foo shows you that programs don't use the old open(2) system call anymore: __NR_openat is 257, which your Pin tool observed 3 times. Apparently even the open() libc wrapper function internally uses the openat Linux system call. (The __NR_open = 2 system call does still work; the kernel also has code to pass its args to the current implementation. I don't know which is more efficient; it may just set up an AT_FDCWD arg and call sys_openat(), which has to decode it again, just like glibc does in user space.)
The open(2) man page also documents openat(2).
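As a rough illustration of the point above, here is a minimal sketch, assuming a Linux system with glibc headers. The names path_arg_index and open_via_openat are hypothetical helpers invented for this example, not part of Pin or glibc. The first reports which argument slot carries the pathname for open versus openat; the second issues the raw openat syscall with AT_FDCWD, which is what the open() wrapper is described as doing.

```cpp
#include <fcntl.h>        // AT_FDCWD, O_RDONLY
#include <sys/syscall.h>  // SYS_open, SYS_openat
#include <unistd.h>       // syscall, close

// Hypothetical helper: index of the pathname argument for the syscalls that
// open a file by path, or -1 if sysnum is not one of them.
int path_arg_index(long sysnum)
{
#ifdef SYS_open  // not defined on every architecture
    if (sysnum == SYS_open) return 0;    // open(path, flags, mode)
#endif
    if (sysnum == SYS_openat) return 1;  // openat(dirfd, path, flags, mode)
    return -1;
}

// Hypothetical helper: issue the raw openat syscall with AT_FDCWD as dirfd,
// so a relative path is resolved against the current directory, as with open(2).
int open_via_openat(const char *path, int flags)
{
    return static_cast<int>(syscall(SYS_openat, AT_FDCWD, path, flags));
}
```

In a syscall-entry callback like the one in the question, the index returned by such a helper could be passed to PIN_GetSyscallArgument and PIN_SetSyscallArgument in place of the hard-coded 0, so that both open and openat are covered.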
openat, linkat, and so on, when used with an fd from open(O_DIRECTORY), let programs like find avoid TOCTOU races, and/or let multi-threaded programs avoid having to actually chdir (because there's only one CWD per process, not per thread). Using them with AT_FDCWD has no advantage or disadvantage vs. old-style open(2).
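To make the directory-fd use case concrete, here is a minimal sketch, assuming a Linux system. The function name open_in_dir is hypothetical. It opens a directory with O_DIRECTORY and then opens a file relative to that descriptor with openat, without ever calling chdir.

```cpp
#include <fcntl.h>   // open, openat, O_DIRECTORY, O_RDONLY
#include <unistd.h>  // close

// Hypothetical helper: open `name` relative to the directory `dir` without
// changing the process-wide current working directory.
// Returns the file descriptor, or -1 on failure.
int open_in_dir(const char *dir, const char *name)
{
    // O_DIRECTORY makes open() fail if `dir` is not a directory.
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0) return -1;

    // The lookup of `name` starts from dirfd, not from the CWD.
    int fd = openat(dirfd, name, O_RDONLY);
    close(dirfd);
    return fd;
}
```

Because the lookup starts from the already-open descriptor, a rename of the directory between the two calls does not redirect the second lookup, and other threads' view of the CWD is untouched.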