Parsec: a question about parsing with many

Posted 2024-09-30 02:07:32

I need to create a parser for a programming language. So far it is 95% done, I'd say, except for a tiny detail.

The program written in this language has the following structure:

outputs
inputs
expressions

The requirement is that outputs cannot be mixed with inputs. For example:

x := output of int;
y := output of in;
.....
z := input of int;
t := input of in;
.....
expressions

I can parse a single output just fine but if I try to use (many1 output), to allow multiple outputs, it doesn't work because it tries to parse the inputs as outputs.

My main parser looks like this:

prog =
    do outs <- many1 output
       ins <- many1 input
       exs <- expressions
       eof
       return (Prog outs ins exs)

I know it seems easy but I tried a lot of stuff and just cannot get it to work. Please help.

-柠檬树下少年和吉他 2024-10-07 02:07:32

If your rule for output looks something like this:

output = do name <- ident
            string ":= output of"
            ty <- ident   -- renamed: "type" is a reserved word in Haskell
            char ';'
            return $ Out name ty

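and your input rule looks the same except with "input of", then the problem is that both rules start with an ident. Parsec does not backtrack automatically: when many1 output reaches the first input declaration, it applies output again, consumes the ident, and then fails when it cannot match "output of". Because input has already been consumed at that point, many1 reports an error instead of stopping and handing over to input.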

To fix this you can just wrap output and input in try, i.e.

outs <- many1 (try output)
ins <- many1 (try input)
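For reference, here is a minimal self-contained sketch of that fix. The Decl type, the lexeme and decl helpers, and the whitespace handling are assumptions made for illustration (the question does not show its real AST or lexer), and the expressions section is left out:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST; the question does not show the real one.
data Decl = Out String String | In String String
  deriving (Show, Eq)

-- Assumed helper: run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Assumed identifier rule: one or more letters.
ident :: Parser String
ident = lexeme (many1 letter)

-- Shared shape of both declarations: name := <keyword> of <type> ;
decl :: String -> (String -> String -> Decl) -> Parser Decl
decl kw con = do
  name <- ident
  _    <- lexeme (string ":=")
  _    <- lexeme (string kw)
  _    <- lexeme (string "of")
  ty   <- ident
  _    <- lexeme (char ';')
  return (con name ty)

output, input :: Parser Decl
output = decl "output" Out
input  = decl "input" In

-- try rewinds the input when output fails partway through an input
-- declaration, so many1 stops cleanly and input gets its turn.
decls :: Parser ([Decl], [Decl])
decls = do
  spaces
  outs <- many1 (try output)
  ins  <- many1 (try input)
  eof
  return (outs, ins)
```

In GHCi, parse decls "" "x := output of int; z := input of int;" should succeed under these assumptions, whereas removing the two try calls makes the same input fail at the first input declaration.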

掐死时间 2024-10-07 02:07:32

While sepp2k's answer works, I'd personally want to encapsulate the backtracking inside the output and input parsers.

Although this adds code to the parsers, it makes them more robust:

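output = do name <- try prefix
            ty <- ident   -- renamed: "type" is a reserved word in Haskell
            char ';'
            return $ Out name ty
  where
    -- Only the prefix is wrapped in try, so a failure after it is still reported precisely.
    prefix = do name <- ident
                string ":= output of"
                return name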

With Parsec, it's generally best to avoid try except for Char parsers and to use left factoring to improve the grammar instead (try can make parsers very fragile). Unfortunately, the grammar you are working with is not particularly friendly to left factoring, and in this case it is probably not worth bothering.
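To give an idea of what left factoring could look like here, the sketch below parses the shared prefix (ident and ":=") once, picks the keyword afterwards, and checks the outputs-before-inputs rule after parsing. All names (Kind, Decl, kind, declSection, lexeme) are hypothetical, and it assumes the declaration section runs to the end of the input, which is not true of the real language since expressions follow:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST for illustration only.
data Kind = Output | Input deriving (Show, Eq)
data Decl = Decl Kind String String deriving (Show, Eq)

-- Assumed helper: run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Assumed identifier rule: one or more letters.
ident :: Parser String
ident = lexeme (many1 letter)

-- "output" and "input" differ in their first character, so <|> can
-- choose between them without try.
kind :: Parser Kind
kind = (Output <$ lexeme (string "output"))
   <|> (Input  <$ lexeme (string "input"))

-- One factored rule: the common prefix is parsed exactly once.
decl :: Parser Decl
decl = do
  name <- ident
  _    <- lexeme (string ":=")
  k    <- kind
  _    <- lexeme (string "of")
  ty   <- ident
  _    <- lexeme (char ';')
  return (Decl k name ty)

isOutput :: Decl -> Bool
isOutput (Decl k _ _) = k == Output

-- Parse every declaration, then enforce the ordering afterwards.
declSection :: Parser ([Decl], [Decl])
declSection = do
  spaces
  ds <- many1 decl
  eof
  let (outs, ins) = span isOutput ds
  if null outs || null ins || any isOutput ins
    then parserFail "expected one or more outputs followed by one or more inputs"
    else return (outs, ins)
```

The trade-off is that the ordering rule moves out of the grammar into a check after parsing, and the approach stops working as soon as expressions can also begin with an ident, which is one reason the try-based version may be the more practical choice for this grammar.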
