Self-modifying code for trace hooks?
I'm looking for the lowest-overhead way of inserting trace/logging hooks into some very performance-sensitive driver code. The logging has to always be compiled in, but most of the time it should do nothing (and do nothing very fast).
Nothing is much simpler than having a global on/off word and doing an if(enabled){log()} at each hook. However, if possible I'd like to avoid even the cost of loading that word every time I hit one of my hooks. It occurs to me that I could potentially use self-modifying code for this -- i.e. everywhere I have a call to my trace function, I overwrite the jump with a NOP when I want to disable the hooks, and put the jump back when I want to enable them.
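For reference, the baseline described above might look like the following minimal sketch. The names trace_enabled, trace_log, TRACE and hot_path are hypothetical, and __builtin_expect is a GCC/Clang builtin used here only to hint that the disabled case is the common one. Every hook still pays for one load and one predictable branch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names throughout; this is only the "global on/off word" baseline. */
static volatile bool trace_enabled = false;  /* the global on/off word */
static unsigned long trace_calls = 0;        /* counts how often the hook really fired */

static void trace_log(const char *msg)
{
    trace_calls++;
    fprintf(stderr, "trace: %s\n", msg);
}

/* Each hook site loads the flag and branches; the hint marks "enabled" as unlikely. */
#define TRACE(msg)                                  \
    do {                                            \
        if (__builtin_expect(trace_enabled, 0))     \
            trace_log(msg);                         \
    } while (0)

/* Stand-in for a performance-sensitive function containing a hook. */
static int hot_path(int x)
{
    TRACE("hot_path entered");
    return x * 2;
}
```

Setting trace_enabled to true makes the next call to hot_path log one line; the self-modifying approach would aim to remove the load and branch that remain when the flag is false.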
A quick google doesn't turn up any prior art on this -- has anyone done it? Is it feasible, and are there any major stumbling blocks that I'm not foreseeing?
(Linux, x86_64)
Comments (4)
Yes, this technique has been implemented within the Linux kernel, for exactly the same purpose (tracing hooks).
See the LWN article on Jump Labels for a starting point.
There aren't really any major stumbling blocks, but there are a few minor ones: multithreaded processes (you will have to stop all other threads while you're enabling or disabling the code); and stale instructions on other cores (x86 keeps the I-cache coherent in hardware, but you still need to make every core execute a serializing instruction before it runs the patched bytes).
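To make the patching idea concrete, here is a small user-space sketch for x86_64 Linux. It is not the kernel's jump-label implementation; all names (stub_create, stub_set_enabled, and so on) are hypothetical. It builds a tiny function in an anonymous page, leaves a 5-byte NOP slot in it, and later overwrites that slot with a 5-byte instruction. A real hook would patch in a call or jmp to the trace function; a "mov eax, 1" is used here so the effect is visible in the return value. The sketch assumes a single thread, so none of the cross-core concerns above are handled.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* x86_64 only. Hypothetical names; single-threaded demonstration. */
enum { PATCH_OFFSET = 2, PATCH_LEN = 5 };

/* Stub: xor eax,eax ; <5-byte patch slot, initially a NOP> ; ret */
static const uint8_t stub_template[] = {
    0x31, 0xc0,                     /* xor eax, eax          */
    0x0f, 0x1f, 0x44, 0x00, 0x00,   /* nopl 0x0(%rax,%rax,1) */
    0xc3                            /* ret                   */
};
static const uint8_t insn_nop5[PATCH_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };
static const uint8_t insn_mov1[PATCH_LEN] = { 0xb8, 0x01, 0x00, 0x00, 0x00 }; /* mov eax, 1 */

typedef int (*stub_fn)(void);

static uint8_t *stub_page = NULL;
static size_t stub_page_size = 0;

/* Copy the stub into a fresh page and make it read+execute. Returns NULL on failure. */
static stub_fn stub_create(void)
{
    stub_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, stub_page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memcpy(p, stub_template, sizeof stub_template);
    if (mprotect(p, stub_page_size, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, stub_page_size);
        return NULL;
    }
    stub_page = p;
    return (stub_fn)(uintptr_t)stub_page;
}

/* Rewrite the 5-byte slot: NOP when disabled, "mov eax, 1" when enabled. */
static int stub_set_enabled(int on)
{
    assert(stub_page != NULL);
    if (mprotect(stub_page, stub_page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(stub_page + PATCH_OFFSET, on ? insn_mov1 : insn_nop5, PATCH_LEN);
    return mprotect(stub_page, stub_page_size, PROT_READ | PROT_EXEC);
}
```

The function returned by stub_create() returns 0 while the slot holds the NOP and 1 after stub_set_enabled(1). Note that the cost of patching is paid only when the state is toggled; the disabled hook site itself executes nothing but the NOP.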
Does it matter if your compiled driver is suddenly twice as large?
Build two code paths -- one with logging, one without. Use global function pointers to jump into the performance-sensitive sections, and overwrite the pointers as appropriate.
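A minimal sketch of that two-path idea, assuming GCC or Clang (for the always_inline attribute) and using hypothetical names (process_impl, process_quiet, process_logged, set_logging). Both variants are generated from one body with a compile-time-constant flag, so the quiet variant contains no logging code at all, and callers go through a single global function pointer.

```c
#include <stdio.h>

/* Hypothetical names; illustrates "two code paths behind a function pointer". */
static unsigned long log_count = 0;

/* One source body; `logging` is a constant in each wrapper, so the
 * compiler can drop the logging branch entirely from the quiet variant. */
static inline __attribute__((always_inline))
int process_impl(int x, int logging)
{
    if (logging) {
        log_count++;
        fprintf(stderr, "process(%d)\n", x);
    }
    return x * 2;
}

static int process_quiet(int x)  { return process_impl(x, 0); }
static int process_logged(int x) { return process_impl(x, 1); }

/* Entry point into the performance-sensitive section. */
static int (*process)(int) = process_quiet;

/* Switch variants by overwriting the pointer; no code is patched. */
static void set_logging(int on)
{
    process = on ? process_logged : process_quiet;
}
```

Callers always invoke process(x). The trade-off is the one the answer names: the sensitive section exists twice in the binary, and each entry costs an indirect call instead of a flag test at every hook.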
If there were a way to somehow declare a register global, you could load the register with the value of your word at every entry point into your driver from the outside and then just check the register. Of course, then you'd be denying the use of that register to the optimizer, which might have some unpleasant performance consequences.
I'm writing not so much about whether this is possible as about whether you gain anything significant.
On the one hand, you don't want to test "logging enabled" every time a logging opportunity presents itself; on the other, you need to test "logging enabled" and overwrite the code with either the yes-case or the no-case code. Or does your driver "remember" that the answer was no last time, so that when no is requested again nothing needs to be done?
The necessary logic does not appear to be trivial compared with testing every time.