Is the valid state domain of a program a regular language?

Posted on 2024-08-29 01:45:21

If you look at the call stack of a program and treat each return pointer as a token, what kind of automaton is needed to build a recognizer for the valid states of the program?

As a corollary, what kind of automaton is needed to build a recognizer for a specific bug state?

(Note: I'm only looking at the info that could be had from this function.)


My thought is that if these form regular languages, then some interesting tools could be built around that. E.g. given a set of crash/failure dumps, automatically group them and generate a recognizer to identify new instances of known bugs.


Note: I'm not suggesting this as a diagnostic tool but as a data management tool for turning a pile of crash reports into something more useful.

  • "These 54 crashes seem related, as do those 42."
  • "These new crashes seem unrelated to anything before date X."
  • etc.

It would seem that I've not been clear about what I'm thinking of accomplishing, so here's an example:

Say you have a program that has three bugs in it.

  • Two bugs that cause invalid args to be passed to a single function, tripping the same sanity check.
  • A function that, if given a (valid) corner case, goes into infinite recursion.

Also assume that when the program crashes (failed assert, uncaught exception, seg-V, stack overflow, etc.) it grabs a stack trace, extracts the call sites on it, and ships them to a QA reporting server. (I'm assuming that only that information is extracted because (1) it's easy to get with a one-time, per-project cost and (2) it has a simple, definite meaning that can be used without any special knowledge about the program.)

What I'm proposing would be a tool that would attempt to classify incoming reports as connected to one of the known bugs (or as a new bug).

The simplest thing would be to assume that one failure site is one bug, but in the first example, two bugs get detected in the same place. The next easiest thing would be to require the entire stack to match, but again, this doesn't work in cases like the second example, where you have multiple pieces of (valid) code that can trip the same bug.
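To make the middle ground between "same failure site" and "same entire stack" concrete, here is a minimal sketch in Python. It is not an established algorithm, just one possible heuristic: collapse immediately repeated runs of call sites (so every depth of a runaway recursion maps to the same signature, roughly what a recognizer for `main walk+` would accept) and group reports by the collapsed sequence. The names `collapse_repeats`, `group_reports`, and the sample report data are all hypothetical.

```python
from collections import defaultdict


def collapse_repeats(sites):
    """Collapse immediately repeated runs of call sites to a single copy.

    Example: main, f, g, f, g, f, g  ->  main, f, g
    """
    out = []
    for site in sites:
        out.append(site)
        # If the newest k entries repeat the k entries before them,
        # drop the newest copy. Try every possible cycle length k.
        for k in range(1, len(out) // 2 + 1):
            if out[-k:] == out[-2 * k:-k]:
                del out[-k:]
                break
    return tuple(out)


def group_reports(reports):
    """Group report ids by their collapsed call-site signature."""
    groups = defaultdict(list)
    for report_id, sites in reports.items():
        groups[collapse_repeats(sites)].append(report_id)
    return dict(groups)


# Hypothetical reports: call sites listed from outermost to innermost frame.
reports = {
    "r1": ["main", "parse", "check"],      # bad args reach check via parse
    "r2": ["main", "load", "check"],       # bad args reach check via load
    "r3": ["main", "walk", "walk", "walk"],  # runaway recursion, shallow dump
    "r4": ["main", "walk", "walk", "walk", "walk", "walk"],  # same bug, deeper
}

groups = group_reports(reports)
for signature, ids in groups.items():
    print(" > ".join(signature), ids)
```

With this sample data, `r1` and `r2` stay separate even though they fail at the same site, while `r3` and `r4` land in the same group despite having different stack depths. Whether exact matching of the collapsed sequence is the right notion of "related" for real crash data is an open assumption.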

Comments (2)

花辞树 2024-09-05 01:45:21

The return pointer on the stack is just a pointer to memory. In theory, if you look at the call stack of a program that makes just one function call, the return pointer (for that one function) can have a different value for every execution of the program. How would you analyze that?

In theory you could read through a core dump using a map file. But doing so is extremely platform- and compiler-specific. You would not be able to create a general tool for doing this with any program. Read your compiler's documentation to see if it includes any tools for doing postmortem analysis.
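As an illustration of what that analysis would have to do, the sketch below (Python) turns raw return addresses into `(module, offset)` tokens. It assumes the crash report also records where each module was loaded, which is itself platform-specific. The function name `to_module_offsets`, the `(name, base, size)` layout, and all addresses are hypothetical.

```python
def to_module_offsets(return_addrs, modules):
    """Map raw return addresses to (module name, offset within module).

    modules: list of (name, base_address, size) tuples, assumed to be
    captured alongside the stack trace. Addresses outside every module
    are kept as ("?", raw_address).
    """
    tokens = []
    for addr in return_addrs:
        for name, base, size in modules:
            if base <= addr < base + size:
                tokens.append((name, addr - base))
                break
        else:
            tokens.append(("?", addr))
    return tokens


# Two hypothetical runs of the same build, loaded at different base addresses.
run_a = to_module_offsets(
    [0x401234, 0x7F0000000040],
    [("app", 0x400000, 0x10000), ("libfoo", 0x7F0000000000, 0x1000)],
)
run_b = to_module_offsets(
    [0x551234, 0x7F1230000040],
    [("app", 0x550000, 0x10000), ("libfoo", 0x7F1230000000, 0x1000)],
)
print(run_a == run_b)
```

The offsets are only comparable between runs of the same build; a recompiled binary would need symbol information (for example from the map file) to line up.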

玻璃人 2024-09-05 01:45:21

If your program is decorated with assert statements, then each assert statement defines a valid state. The program statements between the assertions define the valid state changes.
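A minimal sketch of that idea in Python (the function `pop_min` and its conditions are made up for illustration): the first assert states the valid state on entry, the statements in between perform the state change, and the last assert states the valid state on exit.

```python
def pop_min(items):
    """Remove and return the smallest element of a non-empty list."""
    # Valid state on entry: there is something to remove.
    assert len(items) > 0, "precondition violated: list is empty"

    # State change between the two assertions.
    smallest = min(items)
    items.remove(smallest)

    # Valid state on exit: nothing left is smaller than what we returned.
    assert all(smallest <= x for x in items), "postcondition violated"
    return smallest


print(pop_min([3, 1, 2]))
```

Calling `pop_min([])` fails the first assert immediately, at the point where the invalid state is first observable, rather than crashing somewhere later.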

A program that crashes has violated enough assertions that something broke.

A program that's incorrect but "flaky" has violated at least one assertion but hasn't failed.

It's not at all clear what you're looking for. The valid states are -- sometimes -- hard to define but -- usually -- easy to represent as simple assert statements.

Since a crashed program has violated one or more assertions, a program with explicit, executable assertions doesn't need any crash debugging. It will simply fail an assert statement and die visibly.

If you don't want to put in assert statements then it's essentially impossible to know what state should have been true and which (never-actually-stated) assertion was violated.

Unwinding the call stack to work out the position and the nesting is trivial. But it's not clear what that shows. It tells you what broke, but not what other things led to the breakage. That would require guessing what assertions were supposed to have been true, which requires deep knowledge of the design.


Edit.

"Seem related" and "seem unrelated" are undefinable without recourse to the actual design of the actual application and the actual assertions that should be true in each stack frame.

If you don't know the assertions that should be true, all you have is a random puddle of variables. What can you claim about "related" given a random pile of values?

Crash 1: a = 2, b = 3, c = 4 

Crash 2: a = 3, b = 4, c = 5 

Related? Unrelated? How can you classify these without knowing everything about the code? If you know everything about the code, you can formulate standard assert-statement conditions that should have been true. And then you know what the actual crash is.
