PEG grammar parsing: error when the expression starts with a negative number

Posted on 2025-01-16 06:14:28

I have the following PEG grammar defined:

Program = _{ SOI ~ Expr ~ EOF }
Expr = { UnaryExpr | BinaryExpr }
Term = _{ Int | "(" ~ Expr ~ ")" }
UnaryExpr = { Operator ~ Term }
BinaryExpr = { Term ~ (Operator ~ Term)* }
Operator = { "+" | "-" | "*" | "^" }
Int = @{ Operator? ~ ASCII_DIGIT+ }
WHITESPACE = _{ " " | "\t" }
EOF = _{ EOI | ";" }

And the following expressions are all parsed correctly:

1 + 2   
1 - 2    
1 + -2   
1 - -2   
1        
+1       
-1

But any expression that begins with a negative number fails to parse. For example:

-1 + 2

fails with

  --> 1:4
  |
1 | -1 + 2
  |    ^---
  |
  = expected EOI

What I expect (and what I would like) is for -1 + 2 to be treated the same as 1 + -2, that is, as a binary expression made up of two unary expressions.

I have tried many variations with no success. I'm open to using an entirely different paradigm if I need to, but I'd really like to keep the UnaryExpr idea, since I've already built my parser around it.

I'm new to PEG, so I'd appreciate any help.

For what it's worth, I'm using Rust v1.59 and https://pest.rs/ to both parse and test my expressions.


Comments (1)

帅的被狗咬 2025-01-23 06:14:29

You have a small error in the Expr logic. Choice in a PEG is ordered: if both alternatives could match, the first one, before the |, takes precedence.
And -1 is a valid UnaryExpr, so Expr succeeds there and the program as a whole is expected to match SOI ~ UnaryExpr ~ EOF in this case. But there is additional input (+ 2) after it, which leads to this error.

If you reverse the alternatives of Expr so that Expr = { BinaryExpr | UnaryExpr }, the example works. The reason is that BinaryExpr is now tried first, and UnaryExpr is tried only if BinaryExpr fails.
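To see the effect of the ordering outside of pest, here is a minimal hand-written recognizer in plain Rust that mirrors the grammar rule by rule. It is only a sketch and does not use pest. The names `Order`, `program`, `expr` and the other functions are made up for illustration, and the whitespace skipping only approximates what pest does implicitly between tokens.

```rust
/// Which alternative `Expr` tries first (hypothetical switch for this sketch).
#[derive(Clone, Copy)]
enum Order {
    UnaryFirst,  // Expr = { UnaryExpr | BinaryExpr }  (grammar from the question)
    BinaryFirst, // Expr = { BinaryExpr | UnaryExpr }  (reordered grammar)
}

/// Skip spaces and tabs, mirroring the WHITESPACE rule.
fn skip_ws(s: &[u8], mut i: usize) -> usize {
    while i < s.len() && (s[i] == b' ' || s[i] == b'\t') {
        i += 1;
    }
    i
}

/// Operator = { "+" | "-" | "*" | "^" }
fn operator(s: &[u8], i: usize) -> Option<usize> {
    match s.get(i) {
        Some(&c) if b"+-*^".contains(&c) => Some(i + 1),
        _ => None,
    }
}

/// Int = @{ Operator? ~ ASCII_DIGIT+ }  (atomic: no whitespace allowed inside)
fn int(s: &[u8], i: usize) -> Option<usize> {
    let start = operator(s, i).unwrap_or(i);
    let mut j = start;
    while j < s.len() && s[j].is_ascii_digit() {
        j += 1;
    }
    if j > start { Some(j) } else { None }
}

/// Term = _{ Int | "(" ~ Expr ~ ")" }
fn term(s: &[u8], i: usize, order: Order) -> Option<usize> {
    if let Some(j) = int(s, i) {
        return Some(j);
    }
    if s.get(i) != Some(&b'(') {
        return None;
    }
    let j = skip_ws(s, expr(s, skip_ws(s, i + 1), order)?);
    if s.get(j) == Some(&b')') { Some(j + 1) } else { None }
}

/// UnaryExpr = { Operator ~ Term }
fn unary(s: &[u8], i: usize, order: Order) -> Option<usize> {
    let j = operator(s, i)?;
    term(s, skip_ws(s, j), order)
}

/// BinaryExpr = { Term ~ (Operator ~ Term)* }
fn binary(s: &[u8], i: usize, order: Order) -> Option<usize> {
    let mut end = term(s, i, order)?;
    // Keep consuming `Operator Term` pairs for as long as a full pair matches.
    while let Some(next) =
        operator(s, skip_ws(s, end)).and_then(|k| term(s, skip_ws(s, k), order))
    {
        end = next;
    }
    Some(end)
}

/// Ordered choice: the first alternative that succeeds wins, and the other
/// one is never tried afterwards, even if the rest of the input fails.
fn expr(s: &[u8], i: usize, order: Order) -> Option<usize> {
    match order {
        Order::UnaryFirst => unary(s, i, order).or_else(|| binary(s, i, order)),
        Order::BinaryFirst => binary(s, i, order).or_else(|| unary(s, i, order)),
    }
}

/// Program = _{ SOI ~ Expr ~ EOF }, with EOF = _{ EOI | ";" }
fn program(input: &str, order: Order) -> bool {
    let s = input.as_bytes();
    match expr(s, skip_ws(s, 0), order) {
        Some(j) => {
            let j = skip_ws(s, j);
            j == s.len() || s.get(j) == Some(&b';')
        }
        None => false,
    }
}
```

In this sketch, `program("-1 + 2", Order::UnaryFirst)` returns false, because `unary` consumes `-1` and nothing goes back to try `binary` once the end-of-input check fails, while `program("-1 + 2", Order::BinaryFirst)` returns true. Note that with the reordered rule the leading `-1` is matched by Int's optional sign inside BinaryExpr rather than by UnaryExpr, and an input such as `-(1 + 2) + 3` is still rejected by this sketch, since only UnaryExpr can match the parenthesized prefix.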
