Tracking down uninitialized static variables

Posted on 2024-12-10 18:46:14

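I need to debug an ugly and huge math C library, probably once generated by f2c. The code abuses local static variables, and unfortunately it seems that somewhere it relies on the fact that these are automatically initialized to 0. If its entry function is called twice with the same input, it gives different results. If I unload the library and reload it, it works correctly. It needs to be fast, so I would like to get rid of the load/unload step.

My question is how to uncover these errors with valgrind or any other tool, without manually walking through the entire code.

I am hunting for places where a local static variable is declared, then read first, and only written afterwards. The problem is further complicated by the fact that the static variables are sometimes passed on via pointers (yes, it is that ugly).

I understand one can argue that mistakes like this should not necessarily be detected by an automatic tool, since in some scenarios this is exactly the intended behavior. Still, is there a way to make the auto-initialized local static variables "dirty"?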


安静被遗忘 2024-12-17 18:46:14

The devil is in the details, but this may work for you:

First, get Frama-C. If you are using Unix, your distribution may have a package. The package won't be the latest version, but it may be good enough, and installing it this way will save you some time.

Say your example is as below, only so much bigger that it's not obvious what is wrong:

int add(int x, int y)
{
  static int state;
  int result = x + y + state; // I tested it once and it worked.
  state++;
  return result;
}

Type a command like:

frama-c -lib-entry -main add -deps ugly.c

Options -lib-entry -main add mean "look at function add". Option -deps computes functional dependencies. You'll find these "functional dependencies" in the log:

[from] Function add:
     state FROM state; (and default:false)
     \result FROM x; y; state; (and default:false)

This lists the actual inputs the results of add depend on, and the actual outputs computed from these inputs, including static variables read from and modified. A static variable that was properly initialized before being used would normally not appear as input, unless the analyzer was unable to determine that it was always initialized before being read from.

The log shows state as a dependency of \result. If you expected the returned result to depend only on the arguments (meaning two calls with the same arguments produce the same result), this is a hint that something may be wrong with the variable state.

Another hint shown in the above lines is that the function modifies state.
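
To make the symptom concrete, here is a minimal sketch that calls add twice with the same arguments and checks whether the two results agree. The helper name repeatable_add is hypothetical and introduced only for this illustration; it is not part of the library in question or of Frama-C.

```c
/* Same add() as in the example above: the static counter survives
 * between calls because it is zero-initialized only once, at load time. */
int add(int x, int y)
{
  static int state;
  int result = x + y + state;
  state++;
  return result;
}

/* Hypothetical helper: returns 1 if two identical calls to add()
 * give the same result, 0 otherwise. */
int repeatable_add(int x, int y)
{
  int first = add(x, y);
  int second = add(x, y);
  return first == second;
}
```

In a freshly loaded program, repeatable_add(1, 2) returns 0: the first call yields 3 and the second yields 4, because state was incremented in between. This is the same behavior the dependency of \result on state hints at.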

This may help or not. Option -lib-entry means that the analyzer does not assume that any non-const static variable has kept its value at the time the function under analysis is called, so that may be too imprecise for your code. There are ways around that, but then it is up to you whether you want to gamble the time it takes to learn these ways.

EDIT: here is a more complex example:

void initialize_1(int *p)
{
  *p = 0;
}

void initialize_2(int *p)
{
  *p; // I made a mistake here.
}

int add(int x, int y)
{
  static int state1;
  static int state2;

  initialize_1(&state1);
  initialize_2(&state2);

  // This is safe because I have initialized state1 and state2:
  int result = x + y + state1 + state2; 

  state1++;
  state2++;
  return result;
}

In this example, the same command produces the results:

[from] Function initialize_1:
         state1 FROM p
[from] Function initialize_2:
[from] Function add:
         state1 FROM \nothing
         state2 FROM state2
         \result FROM x; y; state2

What you see for initialize_2 is an empty list of dependencies, meaning the function assigns nothing. I will make this case clearer by displaying an explicit message rather than just an empty list. If you know what any of the functions initialize_1, initialize_2 or add is supposed to do, you can compare this a priori knowledge to the results of the analysis and see that something is wrong for initialize_2 and add.
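
For comparison, here is a sketch of the same example with the mistake repaired, assuming initialize_2 was meant to zero its target the way initialize_1 does (that intent is an assumption based on the function names). Both static variables are now written before they are read on every call.

```c
void initialize_1(int *p)
{
  *p = 0;
}

/* Assumed intent: actually write through the pointer,
 * instead of only reading *p as in the buggy version. */
void initialize_2(int *p)
{
  *p = 0;
}

int add(int x, int y)
{
  static int state1;
  static int state2;

  /* Both statics are reset here, so no value from a
   * previous call can reach the result. */
  initialize_1(&state1);
  initialize_2(&state2);

  int result = x + y + state1 + state2;

  state1++;
  state2++;
  return result;
}
```

With this change, two successive calls add(1, 2) both return 3. One would also expect state2 to drop out of the dependencies of \result, just as state1 already does in the log above, although the analysis has not been rerun on this variant here.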

SECOND EDIT: and now my example shows something strange for initialize_1, so perhaps I should explain that. Variable state1 depends on p in the sense that p is used to write to state1, and if p had been different, then the final value of state1 would have been different. Here is a last example:

int t[10];

void initialize_index(int i)
{
  t[i] = 1;
}

int main(int argc, char **argv)
{
  initialize_index(argv[1][0]-'0');
}

With the command frama-c -deps t.c, the dependencies computed for initialize_index are:

[from] Function initialize_index:
         t[0..9] FROM i (and SELF)

This means that each of the cells depends on i (it may be modified if i is the index of that particular cell). Each cell may also keep its value (if i indicates another cell): this is indicated with the (and SELF) mention in the latest version, and was indicated with a more obscure (and default:true) in previous versions.
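
The meaning of (and SELF) can be checked by hand with a small sketch. The helper untouched_cells is hypothetical and exists only for this illustration; it assumes i is a valid index between 0 and 9.

```c
int t[10];

void initialize_index(int i)
{
  t[i] = 1;
}

/* Hypothetical helper: calls initialize_index(i), then counts the
 * cells other than t[i] that still hold their initial value 0.
 * Assumes 0 <= i <= 9 and that t has not been modified before. */
int untouched_cells(int i)
{
  int count = 0;
  int k;

  initialize_index(i);
  for (k = 0; k < 10; k++) {
    if (k != i && t[k] == 0)
      count++;
  }
  return count;
}
```

Starting from the zero-initialized array, untouched_cells(3) returns 9 and leaves t[3] equal to 1: one cell is determined by i, and the other nine keep their own previous value, which is what the analyzer summarizes as "FROM i (and SELF)".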
