正式构建控制流图

发布于 2024-09-16 11:48:29 字数 2469 浏览 6 评论 0原文

我正在为大学项目编写一个编译器,我想将我的抽象语法树转换为控制流图(CFG)。

我认为 CFG 中的节点(V)应该是来自 AST 的节点。我知道如何在算法上构造边缘集 (G=(V,E)),但我很难更正式地编写该过程,

我创建了这个 scala 样式模式匹配(伪) :

def edges(n:Node)(nestedin_next: Node) : List[(Node,Node)] = 
    n match {
       case (c_1 :: c_2::tl) => (c1,c2) :: edges(c2::tl)(nestedin_next)++
                                   edges(c_1)(c_2)//recurse
       case c_1 :: Nil => (c_1,nestedin_next)::Nil
       case  i@ IF(_,c1,c2) => (i,c1)::(i,c2)::edges(c1)(nestedin_next)++
                                edges(c2)(nestedin_next)
       case _ => Nil
     }

应匹配 AST 结构,如:

( IF(1,
       ASSIGN(x,1), // ia1
       ASSIGN(x,2) // ia2
     ) ::  // i1
  ASSIGN(y,2) ::  // a1
  ASSIGN(z,ADD(x,y)) :: //a2 
  IF(z, 
       RET(z), //i2r1
         assign(z,0):: // i2a1
         ret(z) // i2r2
  ) :://i2
   Nil
)

并提供边缘集,如:

{ i1 -> ia1,
   i1 -> ia2,
   ia1 -> a1,
   ia2 -> a1,
   a1 -> a2,
   a2 -> i2,
   i2 -> i2r1
   i2-> i2a1
   i2a1 -> i2r2
   i2r2 -> _|_
   i2r1 -> _|_ 
}

CFG(dot) DotSrc

任何人都可以获得有关如何执行此操作的任何提示比scala“伪代码”更正式?

我在想一些归纳性的东西,比如:(

e[[ IF(_,b1,b2) ]] = (if -> b1) + (if -> b2) \cup e[[ b1 ]] \cup e[[ b2 ]]
e[[ b1, b2 ]] = e[[b1]] \cup e[[b2]]

上面只会给出一棵树,而不是一个图。例如,从 then-branch 的边缘到下一个语句没有边缘)

编辑:

我一直在阅读 kiama 和 dataflows for scala,我喜欢他们使用的“succ”和“following”方法。尽管如此,我很难将其归结为更正式的描述,主要是因为漂亮的 childAttrs.next 隐藏了一些细节,当我尝试正式指定它时很难看。

EDIT2:

我已经阅读了 Dragon Book 和“ML 中的现代编译器实现”以及来自 学习编写编译器 以及一些/大多数提到数据流和控制流,但从未涉及如何以任何正式方式创建 CFG。

编辑3:

通过 Kiama 作者,副教授 Tony Sloane 博士 我收到了一些要查找的其他书籍参考

据我所知,这些书中的“实现方法”更多地基于程序的“每个语句”,而不是 AST,并且基于基本块。尽管如此,输入还是很棒的!

Im writing a compiler for university project, and I would like to transform my Abstract Syntax Tree into a Control Flow Graph(CFG).

Im thinking that the nodes(V) in the CFG should be nodes from the AST. I know algorithmically how to construct the edge set (G=(V,E)) but Im having a hard time writing the process a bit more formally

I've created this scala style pattern matching (Pseudo):

def edges(n:Node)(nestedin_next: Node) : List[(Node,Node)] = 
    n match {
       case (c_1 :: c_2::tl) => (c1,c2) :: edges(c2::tl)(nestedin_next)++
                                   edges(c_1)(c_2)//recurse
       case c_1 :: Nil => (c_1,nestedin_next)::Nil
       case  i@ IF(_,c1,c2) => (i,c1)::(i,c2)::edges(c1)(nestedin_next)++
                                edges(c2)(nestedin_next)
       case _ => Nil
     }

Which should match an AST structure like:

( IF(1,
       ASSIGN(x,1), // ia1
       ASSIGN(x,2) // ia2
     ) ::  // i1
  ASSIGN(y,2) ::  // a1
  ASSIGN(z,ADD(x,y)) :: //a2 
  IF(z, 
       RET(z), //i2r1
         assign(z,0):: // i2a1
         ret(z) // i2r2
  ) :://i2
   Nil
)

and provide an edgeset like:

{ i1 -> ia1,
   i1 -> ia2,
   ia1 -> a1,
   ia2 -> a1,
   a1 -> a2,
   a2 -> i2,
   i2 -> i2r1
   i2-> i2a1
   i2a1 -> i2r2
   i2r2 -> _|_
   i2r1 -> _|_ 
}

CFG(dot)
DotSrc

Anyone got any hints on how to do this a bit more formally than scala "pseudocode"?

Im thinking something inductive like:

e[[ IF(_,b1,b2) ]] = (if -> b1) + (if -> b2) \cup e[[ b1 ]] \cup e[[ b2 ]]
e[[ b1, b2 ]] = e[[b1]] \cup e[[b2]]

(the above would only give a tree and not a graph though. No edge from edge of then-branch to next statement for example)

EDIT:

I've been reading up on kiama and dataflows for scala, and I like the "succ" and "following" approach they use. Nevertheless, I'm having a hard time boiling that down into a more formal description, mostly because of the nifty childAttr, s.next which hides some of the details that turns ugly when I try to specify it formally.

EDIT2:

I've been through the Dragon Book and "Modern Compiler Implementation in ML" as well as some of the other material from Learning to write a compiler and some/most mentions data flow and control flow, but never touches much upon HOW to create the CFG in any formal way.

EDIT3:

Via Kiama author, Associate Professor Dr. Tony Sloane I recieved some additional book references to look up.

As far as I can see the "way to do it" as per those books is based on a "per statement" of the program more than over the AST and is based on Basic Blocks. Great input nevertheless!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

醉殇 2024-09-23 11:48:29

Google 的 Closure 编译器实现了 控制流分析,它将 JavaScript 的 AST 转换为控制流图。此实现的想法受到论文的启发:Java 源代码的声明式过程内流程分析

Google's Closure Compiler implements a Control-Flow Analysis which transforms an AST for JavaScript into a Control-Flow Graph. The ideas for this implementation are inspired from the paper: Declarative Intraprocedural Flow Analysis of Java Source Code.

唠甜嗑 2024-09-23 11:48:29

如果您的目的只是创建一些看起来更正式的东西,那么您可以使用 标准符号。您应该用单个减少步骤来表达它,而不是递归地表达,因为这样就足以简单地继续应用这些规则,直到无法应用更多规则为止。

也就是说,这个定义本质上与你的 scala 代码表达的内容完全相同。如果你真的想做任何“正式”的事情,你需要证明的属性是:

  • 你的 CFG 翻译算法总是终止
  • 你的 CFG 相对于给定的 AST 输入是否是最小的
  • 对于给定的 AST,你的算法是否可以导出唯一的 CFG输入(即它生成的 CFG 不是不确定的)。

我也不认为你的基本块方法(而不是每语句方法)一定是一个坏主意。如果您可以匹配一个基本块,您就可以编写一条规则,根据该匹配的存在对集合成员身份进行断言,这似乎是完全合理的。看来您开始绘制的归纳定义可以很好地工作。

其他有趣的事情可能是尝试将(正式)结构化操作语义与您的 CFG 构建联系起来。这方面可能已经有工作了,但我只是粗略地进行了谷歌搜索,并没有发现两者之间有任何明确的关系,但直观上似乎应该存在一种关系。

If your intention is to simply create something that looks a bit more formal, then you could express these matching operations as inference rules using the standard notation. You should express it in terms of a single step of reduction, rather than recursively, because then it is sufficient to simply keep applying these rules until no more can be applied.

That said, this definition is essentially going to say exactly the same thing as your scala code. If you really want to do anything "formal" the properties you need to prove are:

  • Your CFG translation algorithm always terminates
  • Whether your CFG is minimal with respect to a given AST input
  • Whether there is a unique CFG derivable by your algorithm for a given AST input (i.e. it's not non-deterministic which CFG it produces).

I don't think your basic blocks approach (rather than a per-statement approach) is necessarily a bad idea, either. It seems perfectly reasonable that if you can match a basic block, you can write a rule that makes assertions about set membership based upon the presence of this match. It seems like the inductive definition you started sketching could work just fine.

Something else interesting might be to try to relate (formally) structured operational semantics and your construction of CFGs. There might already be work in this area, but I only did a cursory google search and didn't find any clearly stated relationship between the two, but intuitively it seems like one should exist.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文