简单可加载内核模块上的内核恐慌
我正在尝试为 Ralink 3050 SOC(处理器 MIPS 24KE)上的嵌入式 OpenWRT 系统创建一个基本的可加载内核模块。使用 MODULES=y 和 CONFIG_MODULE_UNLOAD=y 选项构建的目标系统。我在 Ubuntu 20 X86 机器下工作。
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("test");
int init_module(void) {
printk(KERN_INFO "Hello, world!\n");
return 0;
}
void cleanup_module(void) {
printk(KERN_INFO "Goodbye, world!\n");
}
第一次,我尝试通过内置的 OpenWRT buildroot 交叉编译器来编译模块:
make ARCH=mips CROSS_COMPILE=mipsel-openwrt-linux-musl-
使用了包含 OpenWRT buildroot 的内核头文件。
结果,失败:insmod加载模块后被标记为[永久]。即内核看不到 cleanup_module 过程。我假设这个问题出在文件结构中。我正在检查 vermagic 签名(到另一个可用的系统模块):它匹配。
$modinfo module.ko -> vermagic: 5.10.100 mod_unload MIPS32_R2 32BIT
以防万一,我重建了目标系统,可能是随机错误?但结果是一样的……
然后我决定构建一个自定义的交叉编译器和原始的内核头文件。我已经下载并安装了 binutils 2.38 和 gcc 11.2.0(与 OpenWRT buildroot 相同),并使用相同的 .config 文件下载并准备了内核头文件(内核 5.10.100 – 与目标系统上的相同)。
现在下一个结果:
通过自定义交叉编译器编译 make ARCH=mips CROSS_COMPILE=mipsel-unknown-linux-gnu-
后,模块会加载,但任何下一个调用者 lsmod、rmmod 或 insmod 都会因内核恐慌而完成。看起来,该模块损坏了内核内存。如果模块在没有 cleanup_module 过程的情况下编译,它会正确加载,但状态为 [permanent] 并且无法卸载。
我尝试更改函数声明:使用 __exit 和静态说明符 - 没有结果。
root@(none):/# lsmod
[ 456.197482] CPU 0 Unable to handle kernel paging request at virtual address 034881e0, epc == 80068f78, ra == 80068f64
[ 456.218804] Oops[#1]:
[ 456.223350] CPU: 0 PID: 1017 Comm: lsmod Not tainted 5.10.100 #0
[ 456.235314] $ 0 : 00000000 00000001 00000000 00000001
[ 456.245763] $ 4 : 8160200e 80320b64 00000000 00000001
[ 456.256206] $ 8 : 00000020 00000002 00000001 00000000
[ 456.266645] $12 : 7faff400 77eca2b0 77ec0020 7faff43f
[ 456.277090] $16 : 81a440a4 816110d8 034881d0 80320b64
[ 456.287534] $20 : 80380000 81a441d0 816110f0 81611100
[ 456.297977] $24 : 00000000 80101500
[ 456.308420] $28 : 81a46000 81a47d60 00000002 80068f64
[ 456.318864] Hi : 00000000
[ 456.324594] Lo : 0a3d70a4
[ 456.330333] epc : 80068f78 0x80068f78
[ 456.337973] ra : 80068f64 0x80068f64
[ 456.345609] Status: 1100e403 KERNEL EXL IE
[ 456.353965] Cause : 00800008 (ExcCode 02)
[ 456.361946] BadVA : 034881e0
[ 456.367679] PrId : 0001964c (MIPS 24KEc)
[ 456.375658] Modules linked in: module rt2800soc rt2800mmio rt2800lib rt2x00soc rt2x00mmio rt2x00lib mac80211 cfg80211 crc_ccitt compat sha256_generic libsha256 seqiv jitterentropy_rng drbg hmac cmac leds_gpio crc32c_generic
[ 456.415692] Process lsmod (pid: 1017, threadinfo=1acffaea, task=80d5ed61, tls=77ecbdd4)
[ 456.431637] Stack : 81621700 00400cc0 00000000 00000001 ffffffff 800b6700 81847c80 00002000
[ 456.448345] 00000000 81a47e18 00000000 a050f4de 81847c80 816110d8 81a440a4 00000000
[ 456.465053] 81a47f00 81a47e38 81a47e20 801011d8 00000001 00000001 00000000 00000601
[ 456.481760] 816e2904 800c7f98 77ec1000 81a47e8c 80584860 00000000 00000000 00000000
[ 456.498467] 81847c80 81a47f00 00000400 00000000 00000000 00000003 00000002 80101630
[ 456.515174] ...
[ 456.520050] Call Trace:
[ 456.520071] [<800b6700>] 0x800b6700
[ 456.531895] [<801011d8>] 0x801011d8
[ 456.538848] [<800c7f98>] 0x800c7f98
[ 456.545803] [<80101630>] 0x80101630
[ 456.552751] [<800c1bdc>] 0x800c1bdc
[ 456.559699] [<80018628>] 0x80018628
[ 456.566665] [<800df8f0>] 0x800df8f0
[ 456.573627] [<800faae8>] 0x800faae8
[ 456.580579] [<800de2b0>] 0x800de2b0
[ 456.587531] [<800dfbf0>] 0x800dfbf0
[ 456.594479] [<800c7e2c>] 0x800c7e2c
[ 456.601433] [<8000d36c>] 0x8000d36c
[ 456.611364] Сode: 00001025 10000007 26730b64 <8e460010> 24c6000c 0c040199 02202025 8e520000 24020001
UPD
您是否包含对 module_init() 和 module_exit() 宏的调用 将函数注册为 init/exit?看来你没有从你的 例子。另外,您应该添加 __init 和 __exit 注释。看到这个 我的简单例子。 ——马可·博内利
是的,我尝试使用宏 module_init() 和 module_exit()。没有结果。无论如何,这个宏只是将 init 和 clean 函数名称转换为规范函数名称:init_module 和 cleanup_module,就像用户空间程序中的 main 函数一样。前缀 __init 和 __exit 也没有效果。
听起来就像你使用的编译器的配置不适合 内核。您可以尝试调试“无法处理内核分页 请求”错误。- Tsyvarev
什么是指定用户或内核对象文件类型的“编译器配置”?
如果我们查看(使用 readelf -a 实用程序)我的模块和任何内置模块之间的差异,我们将可以看到 .rel.gnu.linkonce.this_module 部分中 init_module 方法的入口点是相同的,但 cleanup_module 的入口点不同。
Relocation section '.rel.gnu.linkonce.this_module' at offset 0x594 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
000000d0 00001d02 R_MIPS_32 00000000 init_module
00000130 00001c02 R_MIPS_32 00000000 cleanup_module
Relocation section '.rel.gnu.linkonce.this_module' at offset 0x1c8c contains 2 entries:
Offset Info Type Sym.Value Sym. Name
000000d0 00000a02 R_MIPS_32 00000000 _1
00000140 00000902 R_MIPS_32 00000000 _0
I`m trying to create an elementary loadable kernel module to embedded OpenWRT system on the Ralink 3050 SOC (processor MIPS 24KEs). The target system built with MODULES=y and CONFIG_MODULE_UNLOAD=y options. I work under Ubuntu 20 X86 machine.
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("test");
int init_module(void) {
printk(KERN_INFO "Hello, world!\n");
return 0;
}
void cleanup_module(void) {
printk(KERN_INFO "Goodbye, world!\n");
}
The first time, I`m was trying to compile module by the built-in OpenWRT buildroot cross-compiler:
make ARCH=mips CROSS_COMPILE=mipsel-openwrt-linux-musl-
The kernel headers included the OpenWRT buildroot were used.
Result, fail: after insmod load module was marked as [permanent]. I.e. the cleanup_module procedure was not seen by the kernel. I assumed that problem in a file structure. I`m was checking the vermagic signature (to an other system module that is workable): It matched.
$modinfo module.ko -> vermagic: 5.10.100 mod_unload MIPS32_R2 32BIT
Just in case, I had rebuilt the target system, may be it`s stochastic errors? But the result is the same…
Then I decided to build a custom cross-compiler and original kernel headers. I had downloaded and installed the binutils 2.38 and gcc 11.2.0 (same as at the OpenWRT buildroot) and was downloaded and prepared the kernel header with same .config file (kernel 5.10.100 – same as on the target system).
Now the next result:
After compilation make ARCH=mips CROSS_COMPILE=mipsel-unknown-linux-gnu-
by custom cross compiler the module loads, but any next callers lsmod, rmmod or insmod finished by kernel panic. Looks like, the module corrupts kernel memory. If the module compiled without the cleanup_module procedure, it loads correct but have status [permanent] and can`t be unloaded.
I tried to change the function declaration: used __exit and static specificator – no result.
root@(none):/# lsmod
[ 456.197482] CPU 0 Unable to handle kernel paging request at virtual address 034881e0, epc == 80068f78, ra == 80068f64
[ 456.218804] Oops[#1]:
[ 456.223350] CPU: 0 PID: 1017 Comm: lsmod Not tainted 5.10.100 #0
[ 456.235314] $ 0 : 00000000 00000001 00000000 00000001
[ 456.245763] $ 4 : 8160200e 80320b64 00000000 00000001
[ 456.256206] $ 8 : 00000020 00000002 00000001 00000000
[ 456.266645] $12 : 7faff400 77eca2b0 77ec0020 7faff43f
[ 456.277090] $16 : 81a440a4 816110d8 034881d0 80320b64
[ 456.287534] $20 : 80380000 81a441d0 816110f0 81611100
[ 456.297977] $24 : 00000000 80101500
[ 456.308420] $28 : 81a46000 81a47d60 00000002 80068f64
[ 456.318864] Hi : 00000000
[ 456.324594] Lo : 0a3d70a4
[ 456.330333] epc : 80068f78 0x80068f78
[ 456.337973] ra : 80068f64 0x80068f64
[ 456.345609] Status: 1100e403 KERNEL EXL IE
[ 456.353965] Cause : 00800008 (ExcCode 02)
[ 456.361946] BadVA : 034881e0
[ 456.367679] PrId : 0001964c (MIPS 24KEc)
[ 456.375658] Modules linked in: module rt2800soc rt2800mmio rt2800lib rt2x00soc rt2x00mmio rt2x00lib mac80211 cfg80211 crc_ccitt compat sha256_generic libsha256 seqiv jitterentropy_rng drbg hmac cmac leds_gpio crc32c_generic
[ 456.415692] Process lsmod (pid: 1017, threadinfo=1acffaea, task=80d5ed61, tls=77ecbdd4)
[ 456.431637] Stack : 81621700 00400cc0 00000000 00000001 ffffffff 800b6700 81847c80 00002000
[ 456.448345] 00000000 81a47e18 00000000 a050f4de 81847c80 816110d8 81a440a4 00000000
[ 456.465053] 81a47f00 81a47e38 81a47e20 801011d8 00000001 00000001 00000000 00000601
[ 456.481760] 816e2904 800c7f98 77ec1000 81a47e8c 80584860 00000000 00000000 00000000
[ 456.498467] 81847c80 81a47f00 00000400 00000000 00000000 00000003 00000002 80101630
[ 456.515174] ...
[ 456.520050] Call Trace:
[ 456.520071] [<800b6700>] 0x800b6700
[ 456.531895] [<801011d8>] 0x801011d8
[ 456.538848] [<800c7f98>] 0x800c7f98
[ 456.545803] [<80101630>] 0x80101630
[ 456.552751] [<800c1bdc>] 0x800c1bdc
[ 456.559699] [<80018628>] 0x80018628
[ 456.566665] [<800df8f0>] 0x800df8f0
[ 456.573627] [<800faae8>] 0x800faae8
[ 456.580579] [<800de2b0>] 0x800de2b0
[ 456.587531] [<800dfbf0>] 0x800dfbf0
[ 456.594479] [<800c7e2c>] 0x800c7e2c
[ 456.601433] [<8000d36c>] 0x8000d36c
[ 456.611364] Сode: 00001025 10000007 26730b64 <8e460010> 24c6000c 0c040199 02202025 8e520000 24020001
UPD
Did you include calls to the module_init() and module_exit() macros to
register the functions as init/exit? Seems like you didn't from your
example. Also, you should add __init and __exit annotations. See this
simple example of mine. – Marco Bonelli
Yes, I tried to use the macro module_init() and module_exit(). It has no result. Anyway, this macro just translates the init and clean function name to the canonical function name: init_module and cleanup_module, like the main function in a user space program. Prefix __init and __exit also has no effect.
Smells like you use a compiler which configuration is not suitable for
the kernel. You could try to debug "Unable to handle kernel paging
request" error. – Tsyvarev
What is a "compiler configuration" that specifies the user or kernel object file type?
If we look (using readelf -a utility) at the difference between my module and any built-in module, we will see that the entry points of the init_module method in the .rel.gnu.linkonce.this_module section are identical, but the entry points of the cleanup_module are different.
Relocation section '.rel.gnu.linkonce.this_module' at offset 0x594 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
000000d0 00001d02 R_MIPS_32 00000000 init_module
00000130 00001c02 R_MIPS_32 00000000 cleanup_module
Relocation section '.rel.gnu.linkonce.this_module' at offset 0x1c8c contains 2 entries:
Offset Info Type Sym.Value Sym. Name
000000d0 00000a02 R_MIPS_32 00000000 _1
00000140 00000902 R_MIPS_32 00000000 _0
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论