到达定义数据流问题的特例
定义达成问题是数据流分析中最基本的问题之一。给定一个包含变量定义和用途的控制流图,问题会导致计算哪些变量定义可以达到特定用途。
例如,考虑流程图:
____________
1:| x <- ... |
------------
| \
| __________
| 2:| x <- ... |
| -----------
| /
____________
3:| ... <- x |
------------
块 3 中变量 x 的使用可以从块 1 或块 2 中的定义实现。
计算哪些定义可以实现使用的算法是经典的数据流问题。使用龙编译器书(新版)中的符号,到达定义数据流问题如下:
域:定义集(例如 {x <- .., ...})
方向:前进
传递函数:fb(x) = gen(B) U (x - Kill(B)) 其中 gen(B) 是块 B 生成的定义集,kill(B) 是块 B 杀死的定义集
边界:OUT[ENTRY] = {} 即没有定义流进入函数
满足运算符:U(union),即流向块的定义是前驱块中的定义的并集。
方程:OUT[B] = fb(IN[B]), IN[B] = U(P in pred)OUT[P]
初始化: OUT[B] = {}
但是,并非所有定义都是相同的。例如,块 1 中的定义可能永远不会在块 3 中使用,因为它可能会被块 2 中的定义杀死。另一方面,块 2 中的定义如果被执行,将保留其值,直到在块中使用为止3.
我想找到一个用法的到达定义,该用法在从定义到用法的任何路径上都没有终止定义。我的问题是是否存在类似的数据流问题(可能是传播等)。从数据流分析的角度如何解决。
我确实有一个可能的解决方案来解决这个问题,但如果解决方案已经存在,我不想重新发明轮子。
The problem of reaching definitions is one of the most fundamental problem in data flow analysis. Given a control flow graph which contains variable definitions and uses, the problem results into calculating which variable definitions may reach a specific use.
For example consider the flow graph:
____________
1:| x <- ... |
------------
| \
| __________
| 2:| x <- ... |
| -----------
| /
____________
3:| ... <- x |
------------
The use of variable x in block 3 may be reached from either definitions in block 1 or block 2.
The algorithm for computing which definitions may reach a use is classic data flow problem. Using the notation from the dragon compiler book (new edition) the reaching definitions data flow problem is as follows:
Domain : Sets of definitions (e.g. {x <- .., ...})
Direction : Forward
Transfer function : fb(x) = gen(B) U (x - kill(B)) where gen(B) is the set of definitions that block B generates and kill(B) the set of definitions that block B kills
Boundary : OUT[ENTRY] = {} i.e. no definitions flow for entry to a function
Meet operator: U(union), i.e. the definitions that flow to a block is the union of definitions out of predecessor blocks.
Equations : OUT[B] = fb(IN[B]), IN[B] = U(P in pred)OUT[P]
Initialize: OUT[B] = {}
However, not all definitions are the same. For example the definition in block 1 may never reach the use in block 3 as it may be killed by the definition in block 2. On the other hand, the definition in block 2, if executed, will preserve its value until its usage in block 3.
I want to find the reaching definitions of a usage for which there is no killing definitions on any path from the definition to the usage. My question is whether a similar data flow problem exists (maybe propagation etc.). How it can be solved in terms of data flow analysis.
I do have one possible solution to the problem but I would not want to reinvent the wheel if a solution already exists.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
像这样更改问题定义:
满足运算符:∩(相交),即流向块的定义是前驱块的定义的交集。
方程:OUT[B] = fb(IN[B]),
IN[B] = ∩(P in pred)OUT[P]
Change the problem definition like this:
Meet operator: ∩(Intersect), i.e. the definitions that flow to a block is the intersection of definitions out of predecessor blocks.
Equations : OUT[B] = fb(IN[B]),
IN[B] = ∩(P in pred)OUT[P]
这是我解决问题的方法。这可能不是最有效的方法,但我相信它有效。我将把这个问题称为保留定义的问题。首先我计算到达的定义。我正在对位集表示的定义集使用迭代数据流算法。
为了做到这一点,我需要首先计算每个块的 gen(B) 和kill(B)。这些是分别由每个块生成和终止的定义。请注意,这里的kill(B)是实际kill(B)的超集,因为我不知道真正被杀死的定义和块,因为我此时没有考虑数据流。
应用到达定义后,我为控制流图中的每个块设置了 REACH_IN(B) 和 REACH_OUT(B)。我知道保留的定义是到达定义的子集。为了计算它们,我需要找出从程序入口到每个块的哪些定义永远不会被删除。我将这些集合称为“无终止集合”,并且我将提供一种算法来计算图中每个块的 NO_KILL_IN(B) 和 NO_KILL_OUT(B)。这是数据流分析方面的算法。
域:定义集(例如 {x <- .., ...})
方向:前进
传递函数:fb(x) = x - (kill(B) ∩ REACH_IN(B)) 其中kill(B) 是阻止B 杀死的定义集,REACH_IN(B) 是流入B 的定义集。
边界:NO_KILL_OUT[ENTRY] = U(通用集),即所有定义都不会从函数的入口处终止
相遇运算符:∩(交集),即如果定义没有在任何前驱块中被杀死,则该定义不会被杀死。
方程:NO_KILL_OUT[B] = fb(IN[B]), NO_KILL_IN[B] = ∩(P in pred)NO_KILL_OUT[P]
初始化:NO_KILL_OUT[B] = U
请注意,在转移函数中,我们计算kill(B) ∩ REACH_IN(B),这是块B中被杀死的实际定义集。我们不使用的话我们会过于悲观。该算法计算每个块之前和之后哪些定义不能被删除,而不考虑它们是否已生成。为了计算保留的定义,我们只需执行交集:
PRESERVE_IN(B) = REACH_IN(B) ∩ NO_KILL_IN(B)
PRESERVE_OUT(B) = REACH_OUT(B) ∩ NO_KILL_OUT(B)
Here is how I solved the problem. It may not be the most efficient way but I believe it works. I am going to refer to the problem as the problem of preserved definitions. At first I calculate the reaching definitions. I am using the iterative data flow algorithm on sets of definitions represented by bitsets.
In order to do so I need to to compute at first the gen(B) and kill(B) of every block. These are the definitions generated and killed by each block respectively. Note here that the kill(B) is a superset of the actual kill(B) because I do not know what definitions and from what blocks are really being killed since I do not take into acount the data flow at this point.
After applying the reaching definitions I have REACH_IN(B) and REACH_OUT(B) sets for each block in the control flow graph. I know that the preserved definitions are a subset of the reaching definitions. In order to compute them I need to find out what definitions could never have been killed from the entry of the program to each block. I will call these sets no kill sets and I will provide an algorithm to compute NO_KILL_IN(B) and NO_KILL_OUT(B) for each block in the graph. Here is the algorithm in terms of data flow analysis.
Domain : Sets of definitions (e.g. {x <- .., ...})
Direction : Forward
Transfer function : fb(x) = x - (kill(B) ∩ REACH_IN(B)) where kill(B) is the set of definitions that block B kills and REACH_IN(B) the set of definitions flowing into B.
Boundary : NO_KILL_OUT[ENTRY] = U (universal set) i.e. all definitions are not killed from entry of the function
Meet operator: ∩(intersection), i.e. a definition is not killed if it is not killed in any of predecessor block.
Equations : NO_KILL_OUT[B] = fb(IN[B]), NO_KILL_IN[B] = ∩(P in pred)NO_KILL_OUT[P]
Initialize: NO_KILL_OUT[B] = U
Note that in the tranfer function we compute the kill(B) ∩ REACH_IN(B) which is the set of actual definitions being killed in block B. If we do not use that we will be overly perssimistic. The algorith calculates which definitions could not have been killed before and after each block without taking into account if they have been generated or not. In order to compute the preserved definitions we simply preform the intersection:
PRESERVE_IN(B) = REACH_IN(B) ∩ NO_KILL_IN(B)
PRESERVE_OUT(B) = REACH_OUT(B) ∩ NO_KILL_OUT(B)
您可能想看看时态逻辑,计算树逻辑非常适合定义控制流图中路径的属性。本文显示了 CTL 中数据流属性的一些示例:
证明通过时序逻辑优化编译器的正确性
You may want to have a look at temporal logics instead, the computation tree logic lends itself pretty to defining properties over paths in a control flow graph. Some example of data flow properties in CTL are shown in this paper:
Proving correctness of compiler optimizations by temporal logic