Parsec: a question about parsing with many
I need to create a parser for a programming language. So far it is 95% done, I'd say, except for a tiny detail.
A program written in this language has the following structure:
    outputs
    inputs
    expressions
The requirement is that outputs cannot be mixed with inputs. For example:
    x := output of int;
    y := output of in;
    .....
    z := input of int;
    t := input of in;
    .....
    expressions
I can parse a single output just fine, but if I try to use (many1 output) to allow multiple outputs, it doesn't work because it tries to parse the inputs as outputs.
My main parser looks like this:
    prog =
      do outs <- many1 output
         ins  <- many1 input
         exs  <- expressions
         eof
         return (Prog outs ins exs)
I know it seems easy, but I have tried a lot of things and just cannot get it to work. Please help.
If your rule for output starts by parsing an ident and then expects "output of", and your input rule looks the same except with "input of", then the problem is that both rules start with an ident. Since Parsec doesn't backtrack automatically, it will just try to apply output first, consume the ident, and then fail when it can't match "output of". Because input has already been consumed at that point, many1 output reports an error instead of stopping and handing over to many1 input.
To fix this, you can wrap output and input in try.
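A minimal sketch of the try-based fix (wrapping output and input in try where prog uses them). The thread does not show the original rules, so everything except prog, output, input, expressions, ident and Prog is an assumption made for illustration: the Decl type, the lexeme/symbol/keyword helpers, and the placeholder expressions parser are all hypothetical.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST; the question does not show these definitions.
data Decl = Output String String | Input String String
  deriving (Eq, Show)

data Prog = Prog [Decl] [Decl] [String]
  deriving (Eq, Show)

-- Hypothetical helper: run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Hypothetical helper: match fixed syntax such as ":=" or ";".
symbol :: String -> Parser String
symbol = lexeme . string

-- Hypothetical helper: a keyword must not run on into more letters or digits.
keyword :: String -> Parser ()
keyword w = lexeme (try (string w *> notFollowedBy alphaNum))

-- Assumed identifier syntax: one or more letters.
ident :: Parser String
ident = lexeme (many1 letter)

-- Assumed declaration shape: name := <kind> of <type> ;
decl :: String -> (String -> String -> Decl) -> Parser Decl
decl kind con = do
  name <- ident
  _ <- symbol ":="
  keyword kind
  keyword "of"
  ty <- ident
  _ <- symbol ";"
  return (con name ty)

output, input :: Parser Decl
output = decl "output" Output
input  = decl "input" Input

-- Placeholder only: the real expression grammar is not shown in the thread.
expressions :: Parser [String]
expressions = many (ident <* symbol ";")

-- try rewinds the input when output (or input) fails part-way through,
-- so many1 can stop cleanly and hand over to the next parser.
prog :: Parser Prog
prog = do
  spaces
  outs <- many1 (try output)
  ins  <- many1 (try input)
  exs  <- expressions
  eof
  return (Prog outs ins exs)
```

Under these assumptions, parse prog "" src accepts a block of outputs followed by a block of inputs, and rejects a source in which an output appears after the inputs.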
While sepp2k's answer works, I'd personally want to encapsulate the backtracking inside the output and input parsers.
Although this adds code to the parsers, it makes them more robust.
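The code that accompanied this answer did not survive on this page. As a rough sketch of the idea only, try can be limited to the prefix that identifies the kind of declaration, so that callers can write many1 output without any try of their own. The Decl type, the helper parsers, declPrefix and decls are hypothetical names introduced for illustration, and the declaration syntax is assumed from the example in the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST; not shown in the thread.
data Decl = Output String String | Input String String
  deriving (Eq, Show)

-- Hypothetical lexical helpers.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

keyword :: String -> Parser ()
keyword w = lexeme (try (string w *> notFollowedBy alphaNum))

ident :: Parser String
ident = lexeme (many1 letter)

-- The part that tells an output from an input: name := <kind> of
declPrefix :: String -> Parser String
declPrefix kind = do
  name <- ident
  _ <- symbol ":="
  keyword kind
  keyword "of"
  return name

-- try covers only the prefix. Once the prefix has matched, the parser is
-- committed, so a malformed type or a missing ";" is reported where it occurs.
decl :: String -> (String -> String -> Decl) -> Parser Decl
decl kind con = do
  name <- try (declPrefix kind)
  ty <- ident
  _ <- symbol ";"
  return (con name ty)

output, input :: Parser Decl
output = decl "output" Output
input  = decl "input" Input

-- The call site needs no try: the backtracking lives inside output and input.
decls :: Parser ([Decl], [Decl])
decls = (,) <$> many1 output <*> many1 input <* eof
```

Compared with wrapping the whole rule in try, this keeps the backtracking window small, which is one way to read "more robust": an error after the prefix is not swallowed by a rewind.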
With Parsec, it's generally best to avoid try (except for Char parsers) and use left factoring to improve the grammar, since try can make parsers very fragile. Unfortunately, the grammar you are working with is not particularly friendly to left factoring, and in this case it is probably not worth bothering.
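For comparison, here is a hedged sketch of what left factoring could look like for the declarations alone. The shared prefix "name :=" is parsed once, the choice is made on the keyword, and the "outputs before inputs" rule is checked after parsing. All names here (Kind, Decl, isOut, decls and the helpers) are hypothetical. The sketch deliberately stops at eof instead of continuing into expressions, because an expression that also starts with an ident would clash with decl, which is the kind of awkwardness the answer alludes to.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST for this sketch.
data Kind = Out | In
  deriving (Eq, Show)

data Decl = Decl Kind String String
  deriving (Eq, Show)

-- Hypothetical lexical helpers; try is used only at the character level.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

keyword :: String -> Parser ()
keyword w = lexeme (try (string w *> notFollowedBy alphaNum))

ident :: Parser String
ident = lexeme (many1 letter)

-- Left-factored rule: ident and ":=" are parsed once, then the keyword
-- decides the kind, so no backtracking over the identifier is needed.
decl :: Parser Decl
decl = do
  name <- ident
  _ <- symbol ":="
  kind <- (Out <$ keyword "output") <|> (In <$ keyword "input")
  keyword "of"
  ty <- ident
  _ <- symbol ";"
  return (Decl kind name ty)

isOut :: Decl -> Bool
isOut (Decl k _ _) = k == Out

-- Parse all declarations, then enforce "outputs first, inputs second".
decls :: Parser ([Decl], [Decl])
decls = do
  ds <- many1 decl
  eof
  let (outs, ins) = span isOut ds
  if null outs || null ins || any isOut ins
    then fail "expected one or more outputs followed by one or more inputs"
    else return (outs, ins)
```

The trade-off is that the ordering requirement moves out of the grammar and into a check on the parsed list.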