正式构建控制流图
我正在为大学项目编写一个编译器,我想将我的抽象语法树转换为控制流图(CFG)。
我认为 CFG 中的节点(V
)应该是来自 AST 的节点。我知道如何在算法上构造边缘集 (G=(V,E)
),但我很难更正式地编写该过程,
我创建了这个 scala 样式模式匹配(伪) :
def edges(n:Node)(nestedin_next: Node) : List[(Node,Node)] =
n match {
case (c_1 :: c_2::tl) => (c1,c2) :: edges(c2::tl)(nestedin_next)++
edges(c_1)(c_2)//recurse
case c_1 :: Nil => (c_1,nestedin_next)::Nil
case i@ IF(_,c1,c2) => (i,c1)::(i,c2)::edges(c1)(nestedin_next)++
edges(c2)(nestedin_next)
case _ => Nil
}
应匹配 AST 结构,如:
( IF(1,
ASSIGN(x,1), // ia1
ASSIGN(x,2) // ia2
) :: // i1
ASSIGN(y,2) :: // a1
ASSIGN(z,ADD(x,y)) :: //a2
IF(z,
RET(z), //i2r1
assign(z,0):: // i2a1
ret(z) // i2r2
) :://i2
Nil
)
并提供边缘集,如:
{ i1 -> ia1,
i1 -> ia2,
ia1 -> a1,
ia2 -> a1,
a1 -> a2,
a2 -> i2,
i2 -> i2r1
i2-> i2a1
i2a1 -> i2r2
i2r2 -> _|_
i2r1 -> _|_
}
任何人都可以获得有关如何执行此操作的任何提示比scala“伪代码”更正式?
我在想一些归纳性的东西,比如:(
e[[ IF(_,b1,b2) ]] = (if -> b1) + (if -> b2) \cup e[[ b1 ]] \cup e[[ b2 ]]
e[[ b1, b2 ]] = e[[b1]] \cup e[[b2]]
上面只会给出一棵树,而不是一个图。例如,从 then-branch 的边缘到下一个语句没有边缘)
编辑:
我一直在阅读 kiama 和 dataflows for scala,我喜欢他们使用的“succ”和“following”方法。尽管如此,我很难将其归结为更正式的描述,主要是因为漂亮的 childAttr
、s.next
隐藏了一些细节,当我尝试正式指定它时很难看。
EDIT2:
我已经阅读了 Dragon Book 和“ML 中的现代编译器实现”以及来自 学习编写编译器 以及一些/大多数提到数据流和控制流,但从未涉及如何以任何正式方式创建 CFG。
编辑3:
通过 Kiama 作者,副教授 Tony Sloane 博士 我收到了一些要查找的其他书籍参考。
据我所知,这些书中的“实现方法”更多地基于程序的“每个语句”,而不是 AST,并且基于基本块。尽管如此,输入还是很棒的!
Im writing a compiler for university project, and I would like to transform my Abstract Syntax Tree into a Control Flow Graph(CFG).
Im thinking that the nodes(V
) in the CFG should be nodes from the AST. I know algorithmically how to construct the edge set (G=(V,E)
) but Im having a hard time writing the process a bit more formally
I've created this scala style pattern matching (Pseudo):
def edges(n:Node)(nestedin_next: Node) : List[(Node,Node)] =
n match {
case (c_1 :: c_2::tl) => (c1,c2) :: edges(c2::tl)(nestedin_next)++
edges(c_1)(c_2)//recurse
case c_1 :: Nil => (c_1,nestedin_next)::Nil
case i@ IF(_,c1,c2) => (i,c1)::(i,c2)::edges(c1)(nestedin_next)++
edges(c2)(nestedin_next)
case _ => Nil
}
Which should match an AST structure like:
( IF(1,
ASSIGN(x,1), // ia1
ASSIGN(x,2) // ia2
) :: // i1
ASSIGN(y,2) :: // a1
ASSIGN(z,ADD(x,y)) :: //a2
IF(z,
RET(z), //i2r1
assign(z,0):: // i2a1
ret(z) // i2r2
) :://i2
Nil
)
and provide an edgeset like:
{ i1 -> ia1,
i1 -> ia2,
ia1 -> a1,
ia2 -> a1,
a1 -> a2,
a2 -> i2,
i2 -> i2r1
i2-> i2a1
i2a1 -> i2r2
i2r2 -> _|_
i2r1 -> _|_
}
Anyone got any hints on how to do this a bit more formally than scala "pseudocode"?
Im thinking something inductive like:
e[[ IF(_,b1,b2) ]] = (if -> b1) + (if -> b2) \cup e[[ b1 ]] \cup e[[ b2 ]]
e[[ b1, b2 ]] = e[[b1]] \cup e[[b2]]
(the above would only give a tree and not a graph though. No edge from edge of then-branch to next statement for example)
EDIT:
I've been reading up on kiama and dataflows for scala, and I like the "succ" and "following" approach they use. Nevertheless, I'm having a hard time boiling that down into a more formal description, mostly because of the nifty childAttr
, s.next
which hides some of the details that turns ugly when I try to specify it formally.
EDIT2:
I've been through the Dragon Book and "Modern Compiler Implementation in ML" as well as some of the other material from Learning to write a compiler and some/most mentions data flow and control flow, but never touches much upon HOW to create the CFG in any formal way.
EDIT3:
Via Kiama author, Associate Professor Dr. Tony Sloane I recieved some additional book references to look up.
As far as I can see the "way to do it" as per those books is based on a "per statement" of the program more than over the AST and is based on Basic Blocks. Great input nevertheless!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
Google 的 Closure 编译器实现了 控制流分析,它将 JavaScript 的 AST 转换为控制流图。此实现的想法受到论文的启发:Java 源代码的声明式过程内流程分析。
Google's Closure Compiler implements a Control-Flow Analysis which transforms an AST for JavaScript into a Control-Flow Graph. The ideas for this implementation are inspired from the paper: Declarative Intraprocedural Flow Analysis of Java Source Code.
如果您的目的只是创建一些看起来更正式的东西,那么您可以使用 标准符号。您应该用单个减少步骤来表达它,而不是递归地表达,因为这样就足以简单地继续应用这些规则,直到无法应用更多规则为止。
也就是说,这个定义本质上与你的 scala 代码表达的内容完全相同。如果你真的想做任何“正式”的事情,你需要证明的属性是:
我也不认为你的基本块方法(而不是每语句方法)一定是一个坏主意。如果您可以匹配一个基本块,您就可以编写一条规则,根据该匹配的存在对集合成员身份进行断言,这似乎是完全合理的。看来您开始绘制的归纳定义可以很好地工作。
其他有趣的事情可能是尝试将(正式)结构化操作语义与您的 CFG 构建联系起来。这方面可能已经有工作了,但我只是粗略地进行了谷歌搜索,并没有发现两者之间有任何明确的关系,但直观上似乎应该存在一种关系。
If your intention is to simply create something that looks a bit more formal, then you could express these matching operations as inference rules using the standard notation. You should express it in terms of a single step of reduction, rather than recursively, because then it is sufficient to simply keep applying these rules until no more can be applied.
That said, this definition is essentially going to say exactly the same thing as your scala code. If you really want to do anything "formal" the properties you need to prove are:
I don't think your basic blocks approach (rather than a per-statement approach) is necessarily a bad idea, either. It seems perfectly reasonable that if you can match a basic block, you can write a rule that makes assertions about set membership based upon the presence of this match. It seems like the inductive definition you started sketching could work just fine.
Something else interesting might be to try to relate (formally) structured operational semantics and your construction of CFGs. There might already be work in this area, but I only did a cursory google search and didn't find any clearly stated relationship between the two, but intuitively it seems like one should exist.