Strange behavior of printk in a Linux kernel module

Posted on 2024-09-30 13:57:44

I am writing code for a Linux kernel module and am seeing some strange behavior.
Here is my code:

int data = 0;
void threadfn1()
{
    int j;
    for( j = 0; j < 10; j++ )
        printk(KERN_INFO "I AM THREAD 1 %d\n",j);   
    data++;
}

void threadfn2()
{
    int j;
    for( j = 0; j < 10; j++ )
        printk(KERN_INFO "I AM THREAD 2 %d\n",j);
    data++; 
}
static int __init abc_init(void)
{
    struct task_struct *t1 = kthread_run(threadfn1, NULL, "thread1");
    struct task_struct *t2 = kthread_run(threadfn2, NULL, "thread2");
    while( 1 )
    {
        printk("debug\n"); // runs ok
        if( data >= 2 )
        {
            kthread_stop(t1);
            kthread_stop(t2);
            break;
        }
    }
    printk(KERN_INFO "HELLO WORLD\n");

}

Basically, I am trying to wait for the threads to finish and then print something.
The code above does achieve that, but only with printk("debug\n"); left in. As soon as I comment out printk("debug\n"); to run the code without debugging output and load the module with insmod, the module hangs, and it seems to get lost in recursion. I don't understand why printk affects my code so much.

Any help would be appreciated.

regards.

清醇 2024-10-07 13:57:44

You're not synchronizing access to the data variable. As a result, the compiler will generate an infinite loop. Here is why:

  while( 1 )
        {
            if( data >= 2 )
            {
                kthread_stop(t1);
                kthread_stop(t2);
                break;
            }
        }

The compiler can detect that the value of data never changes within the while loop. It can therefore hoist the check out of the loop entirely, and you end up with a simple

 while (1) {}

If you insert printk, the compiler has to assume that the global variable data may change (after all, the compiler has no idea what printk does in detail), so your code starts to work again (in an undefined-behavior kind of way).

How to fix this:

Use proper thread synchronization primitives. If you wrap every access to data in a code section protected by a mutex, the code will work. You could also replace the variable data with a counting semaphore.
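To illustrate the counting-semaphore suggestion, here is a minimal userspace sketch using POSIX threads rather than kernel code, so that it can be compiled and run as an ordinary program. The names done, sem_worker and wait_for_two_workers are made up for this example. Each worker posts the semaphore once when it finishes, and the waiter blocks until it has collected two posts, so there is no polling loop for the compiler to optimise away. In a kernel module the corresponding primitives would be struct semaphore with sema_init(), up() and down().

```c
/* Userspace analogue (POSIX threads), not kernel code. */
#include <pthread.h>
#include <semaphore.h>

static sem_t done;                /* counts workers that have finished */

static void *sem_worker(void *arg)
{
    (void)arg;
    /* ... the thread's real work would go here ... */
    sem_post(&done);              /* signal completion exactly once */
    return NULL;
}

/* Starts two workers and waits for both. Returns 0 on success, -1 on error. */
int wait_for_two_workers(void)
{
    pthread_t t[2];
    int i;

    if (sem_init(&done, 0, 0) != 0)
        return -1;
    if (pthread_create(&t[0], NULL, sem_worker, NULL) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, sem_worker, NULL) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }

    /* Block (no busy loop) until both workers have posted. */
    for (i = 0; i < 2; i++)
        while (sem_wait(&done) != 0)
            ;                     /* retry if interrupted by a signal */

    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&done);
    return 0;
}
```

Calling wait_for_two_workers() from main() returns 0 once both workers have signalled completion.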

Edit:

This link explains how locking in the linux-kernel works:

http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-5.html

燃情 2024-10-07 13:57:44

With the call to printk() removed, the compiler optimises the loop into while (1);. When you add the call to printk(), the compiler cannot be sure that data is unchanged, so it checks the value on every pass through the loop.

You can insert a barrier into the loop, which forces the compiler to re-evaluate data on each iteration, e.g.:

while (1) {
        if (data >= 2) {
                kthread_stop(t1);
                kthread_stop(t2);
                break;
        }

        barrier();
}

强者自强 2024-10-07 13:57:44

Maybe data should be declared volatile? The compiler may not be going to memory to fetch data inside the loop.

凡尘雨 2024-10-07 13:57:44

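Nils Pipenbrinck's answer is spot on. I'll just add some pointers.

Rusty's Unreliable Guide to Kernel Locking (every kernel hacker should read this one).

Goodbye semaphores?, The mutex API (lwn.net articles on the new mutex API introduced in early 2006; before that, the Linux kernel used semaphores as mutexes).

Also, since your shared data is a simple counter, you can just use the atomic API (basically, declare your counter as atomic_t and access it using the atomic_* functions).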
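As a rough illustration of the atomic-counter idea, the sketch below uses C11 <stdatomic.h> and POSIX threads in userspace; it is an analogue, not kernel code, and the names finished, atomic_worker and run_and_count are invented for the example. In a kernel module the counter would instead be an atomic_t initialised with ATOMIC_INIT(0), incremented with atomic_inc() and read with atomic_read(). Because every access goes through an atomic operation, the compiler cannot assume the value is unchanged between loop iterations.

```c
/* Userspace analogue using C11 atomics, not kernel code. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int finished;       /* number of workers that are done */

static void *atomic_worker(void *arg)
{
    (void)arg;
    /* ... the thread's real work would go here ... */
    atomic_fetch_add(&finished, 1);   /* atomic increment */
    return NULL;
}

/* Starts two workers, polls the counter, returns its final value (-1 on error). */
int run_and_count(void)
{
    pthread_t t[2];
    int i;

    atomic_store(&finished, 0);
    if (pthread_create(&t[0], NULL, atomic_worker, NULL) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, atomic_worker, NULL) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }

    /* The atomic load is performed on every iteration; it cannot be hoisted. */
    while (atomic_load(&finished) < 2)
        sched_yield();            /* give the workers a chance to run */

    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&finished);
}
```

Polling is kept here only to mirror the loop in the question; a blocking primitive avoids burning CPU while waiting.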

何处潇湘 2024-10-07 13:57:44

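Volatile is not always a "bad idea". One needs to separate the case where volatile is needed from the case where a mutual exclusion mechanism is needed. It is non-optimal to use, or misuse, one mechanism in place of the other. In the case above, I would suggest that an optimal solution needs both mechanisms: a mutex to provide mutual exclusion, and volatile to tell the compiler that "info" must be read fresh from hardware. Otherwise, in some situations (optimization levels -O2, -O3), compilers might inadvertently leave out the needed code.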
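A small userspace sketch of this combination, assuming POSIX threads (the names lock, read_data, mutex_worker and run_with_mutex are made up for the example, and this is not kernel code): the counter is declared volatile as suggested above, and every read and write happens while holding a mutex.

```c
/* Userspace analogue (POSIX threads), not kernel code. */
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static volatile int data;         /* volatile, as the answer suggests */

static void *mutex_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    data++;                       /* update only while holding the lock */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Reads the shared counter under the lock. */
static int read_data(void)
{
    int v;

    pthread_mutex_lock(&lock);
    v = data;
    pthread_mutex_unlock(&lock);
    return v;
}

/* Starts two workers, waits until both have incremented data (-1 on error). */
int run_with_mutex(void)
{
    pthread_t t[2];
    int i;

    pthread_mutex_lock(&lock);
    data = 0;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&t[0], NULL, mutex_worker, NULL) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, mutex_worker, NULL) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }

    while (read_data() < 2)
        sched_yield();            /* let the workers run */

    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return read_data();
}
```

Calling run_with_mutex() returns 2 once both workers have incremented the counter.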