Linux kernel modules: when to use try_module_get / module_put
I was reading the LKMPG (see Section 4.1.4, Unregistering A Device) and it wasn't clear to me when to use the try_module_get / module_put functions. Some of the LKMPG examples use them, some don't.
To add to the confusion, try_module_get appears 282 times in 193 files in the 2.6.24 source, yet in Linux Device Drivers (LDD3) and Essential Linux Device Drivers it does not appear in even a single code example.
I thought maybe they were tied to the old register_chrdev interface (superseded in 2.6 by the cdev interface), but they only appear together in the same files 8 times:
find -type f -name '*.c' | xargs grep -l try_module_get | sort -u | xargs grep -l register_chrdev | sort -u | grep -c .
So when is it appropriate to use these functions and are they tied to the use of a particular interface or set of circumstances?
Edit
I loaded the sched.c example from the LKMPG and tried the following experiment:
anon@anon:~/kernel-source/lkmpg/2.6.24$ tail /proc/sched -f &
Timer called 5041 times so far
[1] 14594
anon@anon:~$ lsmod | grep sched
sched 2868 1
anon@anon:~$ sudo rmmod sched
ERROR: Module sched is in use
This leads me to believe that the kernel now does its own accounting and the gets / puts may be obsolete. Can anyone verify this?
Comments (1)
You should essentially never have to use try_module_get(THIS_MODULE); pretty much all such uses are unsafe since if you are already in your module, it's too late to bump the reference count -- there will always be a (small) window where you are executing code in your module but haven't incremented the reference count. If someone removes the module exactly in that window, then you're in the bad situation of running code in an unloaded module.
The particular example you linked in LKMPG, where the code does try_module_get() in the open() method, would be handled in the modern kernel by setting the .owner field in struct file_operations to THIS_MODULE.
This makes the VFS code take a reference to the module before calling into it, which eliminates the unsafe window -- either the try_module_get() will succeed before the call to .open(), or the try_module_get() will fail and the VFS will never call into the module. In either case, we never run code from a module that has already been unloaded.
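As a rough sketch of what setting the owner looks like, assuming a character device whose handlers are called device_open and device_release (placeholder names chosen for illustration, not taken from any particular LKMPG listing):

```c
#include <linux/fs.h>
#include <linux/module.h>

/* Hypothetical handlers; a real driver would do its own setup here. */
static int device_open(struct inode *inode, struct file *file)
{
	/* No try_module_get(THIS_MODULE) needed: the caller already
	 * holds a reference taken through .owner before calling us. */
	return 0;
}

static int device_release(struct inode *inode, struct file *file)
{
	/* No module_put(THIS_MODULE) either; the reference taken on
	 * our behalf is dropped when the file is finally released. */
	return 0;
}

static struct file_operations fops = {
	.owner   = THIS_MODULE,	/* lets the caller pin this module */
	.open    = device_open,
	.release = device_release,
};
```

With .owner set this way, the open and release handlers contain no reference counting of their own.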
The only good time to use try_module_get() is when you want to take a reference on a different module before calling into it or using it in some way (e.g. as the file open code does in the example I explained above). There are a number of uses of try_module_get(THIS_MODULE) in the kernel source, but most if not all of them are latent bugs that should be cleaned up.
The reason you were not able to unload the sched example is that your tail /proc/sched -f & command keeps /proc/sched open and, because of how the sched.c code sets up that /proc entry, opening /proc/sched increments the reference count of the sched module, which accounts for the 1 reference that your lsmod shows. From a quick skim of the rest of the code, I think if you release /proc/sched by killing your tail command, you would be able to remove the sched module.
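The accounting described in this answer can be modelled in a few lines of ordinary userspace C. This is only a toy model, not kernel code: every name prefixed with sim_ is hypothetical, and the real kernel additionally has to deal with concurrency and with module states that are ignored here. The point it illustrates is that the caller takes the reference on the owning module before calling into it, and that unloading is refused while a reference is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct module (hypothetical). */
struct sim_module {
    bool live;     /* false once the module has been unloaded */
    int  refcount; /* number of outstanding references */
};

/* Models try_module_get(): fails if the module is no longer live. */
static bool sim_try_module_get(struct sim_module *m)
{
    if (m == NULL)
        return true;  /* no owner: nothing to pin */
    if (!m->live)
        return false;
    m->refcount++;
    return true;
}

/* Models module_put(): drops one reference. */
static void sim_module_put(struct sim_module *m)
{
    if (m != NULL) {
        assert(m->refcount > 0);
        m->refcount--;
    }
}

/* Models rmmod: refused while anyone holds a reference. */
static bool sim_unload(struct sim_module *m)
{
    if (m->refcount > 0)
        return false;  /* "Module ... is in use" */
    m->live = false;
    return true;
}

/* Toy stand-in for struct file_operations. */
struct sim_file_ops {
    struct sim_module *owner;
    int (*open)(void);
};

/* The caller pins the owner BEFORE calling into the module's code. */
static int sim_vfs_open(const struct sim_file_ops *ops)
{
    if (!sim_try_module_get(ops->owner))
        return -1;  /* module going away: never call into it */
    int ret = ops->open();
    if (ret != 0)
        sim_module_put(ops->owner);  /* open failed: drop the pin */
    return ret;
}

/* Closing the file drops the reference taken at open time. */
static void sim_vfs_release(const struct sim_file_ops *ops)
{
    sim_module_put(ops->owner);
}

/* Hypothetical open handler living "inside" the module. */
static int sim_demo_open(void)
{
    return 0;
}
```

In this model, opening a file through sim_vfs_open() raises the owner's count to 1, sim_unload() is then refused (the analogue of the "Module sched is in use" error), and after sim_vfs_release() the unload succeeds. Once the module is unloaded, a further sim_vfs_open() fails without ever calling sim_demo_open().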