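How can I intercept Linux system calls?
Besides the LD_PRELOAD trick and Linux kernel modules that replace a certain syscall with one you provide, is there any way to intercept a syscall (open, for example) so that it first goes through your function before it reaches the actual open?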
First, let's eliminate some non-answers that other people have given:
LD_PRELOAD
Yes, you said "Besides LD_PRELOAD..." in the question, but apparently that isn't enough for some people. This isn't a good option because it only works if the program uses libc, which isn't necessarily the case.
However, there are other possibilities not mentioned here yet. Note that I'm new to all this and haven't tried any of it yet, so I may be wrong about some things.
Rewrite the code
In theory you could use some kind of custom loader that rewrites the syscall instructions to jump to a custom handler instead. But I think that would be an absolute nightmare to implement.
kprobes
kprobes is a kernel instrumentation mechanism. As I understand it, probes only get read-only access to what they observe, so you can use them to log syscalls but not to intercept them.
ptrace
ptrace is the API that debuggers like GDB use to do their debugging. There is a PTRACE_SYSCALL option which will pause execution just before and after each syscall. From there you can do pretty much whatever you like, in the same way that GDB can. Here's an article about how to modify syscall parameters using ptrace. However, it apparently has high overhead.
Seccomp
Seccomp is a system that is designed to let you filter syscalls. You can't modify the arguments, but you can block the calls or return custom errors. Seccomp filters are BPF programs. If you're not familiar with them, they are basically small programs that users can run in a kernel-space VM. This avoids the user/kernel context switch, which makes them faster than ptrace.
While you can't modify arguments directly from your BPF program, you can return SECCOMP_RET_TRACE, which will trigger a ptrace-ing parent to break. So it's basically the same as PTRACE_SYSCALL, except that you get to run a program in kernel space to decide whether you actually want to intercept a given syscall. So it should be faster if you only want to intercept some syscalls (e.g. only open(), leaving everything else untouched). One caveat: the filter only sees raw argument values and cannot dereference pointers, so matching on specific paths still has to happen in the tracer.
I think this is probably the best option. Here's an article about it from the same author as the one above.
Note they use classic BPF instead of eBPF, but I guess you can use eBPF too.
Edit: Actually, you can only use classic BPF, not eBPF. There's an LWN article about it.
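To make the seccomp part concrete, here is a minimal sketch of a classic-BPF filter that makes openat() fail with EACCES and allows everything else. The function names install_openat_filter and openat_blocked_in_child are made up for illustration, the filter skips the architecture check that a real filter should perform on seccomp_data.arch, and it assumes a kernel and sandbox that permit unprivileged seccomp filters.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Install a classic-BPF seccomp filter in the calling process.
 * openat() fails with EACCES; every other syscall is allowed.
 * Assumption: no architecture check, which a real filter should add.
 * Returns 0 on success, -1 on error. */
int install_openat_filter(void)
{
    struct sock_filter filter[] = {
        /* Load the syscall number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* openat: fall through to the ERRNO return; otherwise skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_openat, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) == 0 ? 0 : -1;
}

/* Fork a child, install the filter there, and try openat().
 * Returns 1 if the call was rejected with EACCES, 0 if it was not,
 * and -1 if the filter could not be installed or the child failed. */
int openat_blocked_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_openat_filter() != 0)
            _exit(2);
        int fd = openat(AT_FDCWD, "/dev/null", O_RDONLY);
        _exit((fd == -1 && errno == EACCES) ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 0)
        return 1;
    return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```

The filter is installed in a forked child because a seccomp filter cannot be removed once it is in place. To hand the decision to a tracer instead of returning an error, the ERRNO return would become SECCOMP_RET_TRACE, and the tracer would need to set PTRACE_O_TRACESECCOMP.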
Here are some related questions. The first one is definitely worth reading.
There's also a good article about manipulating syscalls via ptrace here.
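As a rough illustration of the PTRACE_SYSCALL approach described above, here is a minimal sketch that runs a program under ptrace and stops at every syscall entry. The function name trace_syscalls is made up for illustration, the syscall number is only read on x86-64 (an assumption of this sketch), and nothing is modified; a real interceptor would change the registers at the entry stop.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv under ptrace and stop at every syscall.
 * Returns the number of syscall-entry stops seen, or -1 on error. */
long trace_syscalls(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
            _exit(126);            /* tracing is not permitted here */
        raise(SIGSTOP);            /* let the parent set options first */
        execvp(argv[0], argv);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    /* Mark syscall stops as SIGTRAP|0x80 so they differ from signals. */
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) != 0)
        return -1;

    long entries = 0;
    int in_syscall = 0;            /* toggles between entry and exit */
    int sig = 0;                   /* signal to forward on resume */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) < 0)
            break;
        if (waitpid(pid, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        sig = 0;
        if (!WIFSTOPPED(status))
            continue;

        int stopsig = WSTOPSIG(status);
        if (stopsig == (SIGTRAP | 0x80)) {
            in_syscall = !in_syscall;
            if (in_syscall) {
                entries++;
#if defined(__x86_64__)
                /* This is where arguments could be inspected or changed. */
                struct user_regs_struct regs;
                if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == 0)
                    fprintf(stderr, "syscall %lld\n",
                            (long long)regs.orig_rax);
#endif
            }
        } else if (stopsig != SIGTRAP) {
            sig = stopsig;         /* forward real signals to the child */
        }
        /* A plain SIGTRAP is the post-exec notification; suppress it. */
    }
    return entries;
}
```

Calling trace_syscalls with an argv such as { "true", NULL } traces that program to completion. Every syscall costs two stops and several context switches in the tracer, which is where the overhead mentioned above comes from.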
Why can't you / don't want to use the LD_PRELOAD trick?
Example code here:
From what I understand... it is pretty much the LD_PRELOAD trick or a kernel module. There's not a whole lot of middle ground unless you want to run it under an emulator which can trap out to your function or do code re-writing on the actual binary to trap out to your function.
Assuming you can't modify the program and can't (or don't want to) modify the kernel, the LD_PRELOAD approach is the best one, assuming your application is fairly standard and isn't actually one that's maliciously trying to get past your interception. (In which case you will need one of the other techniques.)
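For reference, a minimal sketch of what an LD_PRELOAD shim for open() could look like. This is not the listing the answer originally linked to. The counter intercepted_open_calls is a made-up name added for illustration, and the sketch ignores O_TMPFILE as well as the related entry points (open64, openat, fopen) that a real shim would also need to wrap.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Number of open() calls seen so far (illustrative only). */
int intercepted_open_calls = 0;

/* Replacement for libc's open(): log the call, then forward it. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    mode_t mode = 0;

    /* The third argument is only present when a file may be created. */
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    /* Look up the next definition of open() in library search order. */
    if (!real_open)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
    if (!real_open) {
        errno = ENOSYS;
        return -1;
    }

    intercepted_open_calls++;
    dprintf(STDERR_FILENO, "open(\"%s\", 0x%x)\n", path, (unsigned)flags);
    return real_open(path, flags, mode);
}
```

Assuming the file is saved as shim.c (a hypothetical name), it could be built with something like gcc -shared -fPIC -o libshim.so shim.c -ldl and loaded by setting LD_PRELOAD=./libshim.so when starting the target program. It only catches calls that go through the dynamically linked libc wrapper, which is exactly the limitation discussed in this thread.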
Valgrind can be used to intercept any function call. If you need to intercept a system call in your finished product, then this will be of no use. However, if you are trying to intercept during development, it can be very useful. I have frequently used this technique to intercept hashing functions so that I can control the returned hash for testing purposes.
In case you are not aware, Valgrind is mainly used for finding memory leaks and other memory-related errors. But the underlying technology is basically an x86 emulator. It emulates your program and intercepts calls to malloc/free etc. The good thing is that you do not need to recompile to use it.
Valgrind has a feature called Function Wrapping, which is used to control the interception of functions. See section 3.2 of the Valgrind manual for details. You can set up function wrapping for any function you like. Once the call is intercepted, the alternative function that you provide is invoked.
If you just want to watch what's opened, look at the ptrace() function or the source code of the command-line strace utility. If you actually want to intercept the call, to maybe make it do something else, I think the options you listed, LD_PRELOAD or a kernel module, are your only options.
I don't have the syntax to do this gracefully with an LKM offhand, but this article provides a good overview of what you'd need to do: http://www.linuxjournal.com/article/4378
You could also just patch the sys_open function. It starts on line 1084 of fs/open.c as of linux-2.6.26.
You might also check whether you can use inotify, SystemTap or SELinux to do all this logging for you without having to build a new system.
Some applications can trick strace/ptrace into not running, so the only real option I've had is SystemTap.
SystemTap can intercept a bunch of system calls if need be, thanks to its wildcard matching. SystemTap is not C, but a separate language. In basic mode, SystemTap should prevent you from doing stupid things, but it can also run in "expert mode", which falls back to allowing a developer to use C if that is required.
It does not require you to patch your kernel (or at least it shouldn't), and once a module has been compiled, you can copy it from a test/development box and insert it (via insmod) on a production system.
I have yet to find a Linux application that has found a way to work around or avoid getting caught by SystemTap.
Sounds like you need auditd.
Auditd allows global tracking of all syscalls or accesses to files, with logging. You can set keys for specific events that you are interested in.
If you really need a solution, you might be interested in the DR rootkit, which accomplishes just this: http://www.immunityinc.com/downloads/linux_rootkit_source.tbz2
An article about it is here: http://www.theregister.co.uk/2008/09/04/linux_rootkit_released/
If you just want to do it for debugging purposes, look into strace, which is built on top of the ptrace(2) system call and lets you hook in code whenever a system call is made. See the PTRACE_SYSCALL part of the man page.
Using SystemTap may be an option.
For Ubuntu, install it as indicated in https://wiki.ubuntu.com/Kernel/Systemtap.
Then just execute the following and you will be listening on all openat syscalls: