将 CFG 转换为 IL

发布于 2024-08-02 14:48:54 字数 489 浏览 12 评论 0原文

我从任意 IL 构建一个 CFG,并希望将该 CFG 转换回 IL。 CFG中顶点的顺序当然不等于原始IL指令的顺序。

这很好,但有些东西过于复杂。想象一下:

Jump 'B'
'C': Return
'B': Jump 'C'

这将产生如下所示的流程图: (Jump B) -> (跳转C)-> (返回) 这当然是一个简化的示例,但它显示了从 CFG 转换出来时的问题。

学术界有关于这个主题的任何信息吗?我认为自下而上遍历图表会非常优雅,但在更复杂的情况下不起作用。

一个解决方案可能是自上而下并搜索 CF 合并,但在这种情况下我将无法正确处理循环。因此,解决这个问题的唯一方法似乎是搜索可能的 CF 合并(如果发生)。如果没有,我们必须有一个循环,这意味着循环是首选,并且随后评估连续路径。这听起来像是一个可以解决的问题,但它也非常昂贵,并且可能存在更优雅的解决方案。此外,当考虑“break”语句时,循环也可能导致 CF 合并。

I build a CFG out of an arbitrary IL and want to convert that CFG back to IL. The order of the vertices in the CFG is of course not equal to the order of the original IL instructions.

This is fine but overcomplicates some stuff. Imagine:

Jump 'B'
'C': Return
'B': Jump 'C'

This would result in a flow graph like this: (Jump B) -> (Jump C) -> (Return)
This is of course a simplified example but it shows the problem when converting out of the CFG.

Is there any information available on this topic in academia? I thought traversing the graph bottom-up would be very elegant but that does not work in more complicated cases.

A solution might be to walk top-down and search for a CF merge but in that case I would not be able to handle loops correct. So the only way to get this right seems to be to search for a possible CF merge if it occurs. If not, we have to have a loop, which means the loop is preferred and the continuing path is evaluated afterwards. This sounds like a solveable problem but it is also very expensive and there might exist a more elegant solution to the problem. Besides a loop could also result in a CF merge when thinking about a "break" statement.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

甜`诱少女 2024-08-09 14:48:54

将 CFG 转换为 IL:您想要遍历图,将每个顶点恰好发出一次(除了那些无法到达的顶点)。因此,您需要记录已发出的顶点:顶点上的标志即可,或者从顶点到 True/False 的散列。

有些顶点会有多个后继者,你只能直接跟随其中一个;所以你需要一种方法来跟踪你想要稍后返回的顶点。队列适合于此。

这或多或少是我用过的。

def needs_label(cfg, v, last):
    if cfg.predecessors(v) > 1:
        # There's more than one way of entering this vertex
        return True
    elif cfg.predecessors(v) == 1 and last != cfg.predecessors(v)[0]:
        # There's only one way, but the last vertex emitted was not that way
        # so it will be entered using a jump.
        return True
    else:
        return False

def emit_label(v):
    print 'label%d' % (v.id)

def emit_vertex(v):
    if v.type == 'branch':
        # Branch to second successor
        print 'br label%d' % cfg.successors(v)[1].id
    else:
        ...

def emit_jump(v):
    print 'jmp label%d' % v.id

def emit_cfg(cfg):
    q = Queue()   # Queue for saving vertices that we want to emit later
    done = {}    # Hash recording which vertices have already been emitted
    q.push(cfg.start())
    while not q.empty():
        v = q.pop()
        last = None
        while v is not None and not done[v]:
            # Emit the vertex, with a prefixed label if necessary
            if needs_label(cfg, v, last):
                emit_label(v)
            emit_vertex(v)
            done[v] = True
            last = v
            # Get the vertex's successors
            succs = cfg.successors(v)
            # If there aren't any, then this path is finished, so go back to
            # the outer loop to pop another saved vertex
            if len(succs) == 0:
                v = None   # Setting this will terminate the inner loop
                continue
            # Stick all the vertices on the stack for later, in case we can't
            # process them all here
            for s in succs:
                q.push(s)
            # Pick a new vertex from the list of successors.  Always pick the first
            # because if it's a branch then the second will have been branched on
            v = succs[0]
            # If it was emitted earlier we need to jump to it
            if done[v]:
                emit_jump(v)
                v = None
            # Otherwise continue the inner loop, walking from the new vertex

对分支(具有多个后继的顶点)的处理非常幼稚:通常您想找出哪个更有可能,并在可能的情况下直接遵循该分支。

Converting CFG into IL: you want to walk over the graph, emitting each vertex exactly once (except those that are unreachable). So you need to record which vertices have been emitted: a flag on the vertex would do, or a hash from vertex to True/False.

Some vertices will have more than one successor, and you can only follow one of them directly; so you want a way to keep track of vertices that you want to come back to later. A queue is suitable for this.

This is more-or-less what I've used.

def needs_label(cfg, v, last):
    if cfg.predecessors(v) > 1:
        # There's more than one way of entering this vertex
        return True
    elif cfg.predecessors(v) == 1 and last != cfg.predecessors(v)[0]:
        # There's only one way, but the last vertex emitted was not that way
        # so it will be entered using a jump.
        return True
    else:
        return False

def emit_label(v):
    print 'label%d' % (v.id)

def emit_vertex(v):
    if v.type == 'branch':
        # Branch to second successor
        print 'br label%d' % cfg.successors(v)[1].id
    else:
        ...

def emit_jump(v):
    print 'jmp label%d' % v.id

def emit_cfg(cfg):
    q = Queue()   # Queue for saving vertices that we want to emit later
    done = {}    # Hash recording which vertices have already been emitted
    q.push(cfg.start())
    while not q.empty():
        v = q.pop()
        last = None
        while v is not None and not done[v]:
            # Emit the vertex, with a prefixed label if necessary
            if needs_label(cfg, v, last):
                emit_label(v)
            emit_vertex(v)
            done[v] = True
            last = v
            # Get the vertex's successors
            succs = cfg.successors(v)
            # If there aren't any, then this path is finished, so go back to
            # the outer loop to pop another saved vertex
            if len(succs) == 0:
                v = None   # Setting this will terminate the inner loop
                continue
            # Stick all the vertices on the stack for later, in case we can't
            # process them all here
            for s in succs:
                q.push(s)
            # Pick a new vertex from the list of successors.  Always pick the first
            # because if it's a branch then the second will have been branched on
            v = succs[0]
            # If it was emitted earlier we need to jump to it
            if done[v]:
                emit_jump(v)
                v = None
            # Otherwise continue the inner loop, walking from the new vertex

Treatment of branches (vertices with more than one successor) is pretty naive: normally you want to figure out which is more likely and follow that one directly, if possible.

情深如许 2024-08-09 14:48:54

这比听起来更容易。基本上,人们可以去掉 CFG 中的任何跳转语句,从而得到优化的图。一旦图形线性化,跳转语句将被插入回来。这不会保留指令的原始顺序,但会产生具有相同控制流的方法。

This is easier than it sounds like. Basically one may get rid of any Jump statements in the CFG which results in an optimized graph. Jump statements will be inserted back once the graph is linearized. This does not keep the original order of instructions but results in a method with the same control flow.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文