通过递归下降从该语法生成表达式
我有一个简单的语法。实际上,我使用的语法更复杂,但这是说明我的问题的最小子集。
Expr ::= Value Suffix
| "(" Expr ")" Suffix
Suffix ::= "->" Expr
| "<-" Expr
| Expr
| epsilon
Value
匹配标识符、字符串、数字等。 Suffix
规则是为了消除左递归。这匹配如下表达式:
a -> b (c -> (d) (e))
也就是说,一个图形,其中 a
既转到 b
又转到 (c -> (d) (e)) 的结果
,c
转到 d
和 e
。我试图为这些表达式生成一个抽象语法树,但我遇到了困难,因为所有运算符都可以接受任意数量的操作数。我宁愿将生成 AST 的逻辑保留在递归下降解析方法中,因为它避免了重复提取表达式的逻辑。我当前的策略如下:
如果出现
Value
,则将其推送到输出。如果出现
From
或To
:输出分隔符。
获取下一个
Expr
。创建一个
Link
节点。将输出中的第一组操作数弹出到
Link
中,直到出现分隔符。擦除发现的分隔符。
将第二组操作数弹出到
Link
中,直到出现分隔符。将
Link
推送到输出。
如果我在不遵守步骤 2.3–2.7 的情况下运行此操作,我会得到一个值和分隔符的列表。对于上面引用的表达式,a -> b (c -> (d) (e))
,输出应该是:
A sep_1 B sep_2 C sep_3 D E
应用 To
规则将产生:
A sep_1 B sep_2 (link from C to {D, E})
随后:
(link from A to {B, (link from C to {D, E})})
需要注意的重要一点是 sep_2
,对于分隔第二个 ->
的左侧操作数至关重要,没有出现,因此解析器认为该表达式实际上是编写的:
a -> (b c -> (d) (e))
为了解决这个问题根据我当前的策略,我需要一种在相邻表达式之间生成分隔符的方法,但前提是当前表达式是括在括号中的 From
或 To
表达式。如果可能的话,那么我只是没有看到它,答案应该很简单。但是,如果有更好的方法来解决这个问题,请告诉我!
I've got a simple grammar. Actually, the grammar I'm using is more complex, but this is the smallest subset that illustrates my question.
Expr ::= Value Suffix
| "(" Expr ")" Suffix
Suffix ::= "->" Expr
| "<-" Expr
| Expr
| epsilon
Value
matches identifiers, strings, numbers, et cetera. The Suffix
rule is there to eliminate left-recursion. This matches expressions such as:
a -> b (c -> (d) (e))
That is, a graph where a
goes to both b
and the result of (c -> (d) (e))
, and c
goes to d
and e
. I'm trying to produce an abstract syntax tree for these expressions, but I'm running into difficulty because all of the operators can accept any number of operands on each side. I'd rather keep the logic for producing the AST within the recursive descent parsing methods, since it avoids having to duplicate the logic of extracting an expression. My current strategy is as follows:
If a
Value
appears, push it to the output.If a
From
orTo
appears:Output a separator.
Get the next
Expr
.Create a
Link
node.Pop the first set of operands from output into the
Link
until a separator appears.Erase the separator discovered.
Pop the second set of operands into the
Link
until a separator.Push the
Link
to the output.
If I run this through without obeying steps 2.3–2.7, I get a list of values and separators. For the expression quoted above, a -> b (c -> (d) (e))
, the output should be:
A sep_1 B sep_2 C sep_3 D E
Applying the To
rule would then yield:
A sep_1 B sep_2 (link from C to {D, E})
And subsequently:
(link from A to {B, (link from C to {D, E})})
The important thing to note is that sep_2
, crucial to delimit the left-hand operands of the second ->
, does not appear, so the parser believes that the expression was actually written:
a -> (b c -> (d) (e))
In order to solve this with my current strategy, I would need a way to produce a separator between adjacent expressions, but only if the current expression is a From
or To
expression enclosed in parentheses. If that's possible, then I'm just not seeing it and the answer ought to be simple. If there's a better way to go about this, however, then please let me know!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我没有尝试详细分析它,但是:“
From
或To
表达式用括号括起来”开始听起来很像“上下文相关”,递归下降无法直接处理。为了避免上下文依赖,您可能需要单独产生括号中的From
或To
与From
或To< /code> 不带括号。
编辑:虽然可能为时已晚,做任何好事,如果我对你想要匹配的内容的理解是正确的,我想我会写得更像这样:
很难确定,但我认为这至少应该是接近匹配(仅)您想要的输入,并且应该可以相当容易地生成正确反映输入的 AST。
I haven't tried to analyze it in detail, but: "
From
orTo
expression enclosed in parentheses" starts to sound a lot like "context dependent", which recursive descent can't handle directly. To avoid context dependence you'll probably need a separate production for aFrom
orTo
in parentheses vs. aFrom
orTo
without the parens.Edit: Though it may be too late to do any good, if my understanding of what you want to match is correct, I think I'd write it more like this:
It's hard to be certain, but I think this should at least be close to matching (only) the inputs you want, and should make it reasonably easy to generate an AST that reflects the input correctly.