PEG grammar parsing fails when an expression starts with a negative number
I have the following PEG grammar defined:
Program = _{ SOI ~ Expr ~ EOF }
Expr = { UnaryExpr | BinaryExpr }
Term = _{Int | "(" ~ Expr ~ ")" }
UnaryExpr = { Operator ~ Term }
BinaryExpr = { Term ~ (Operator ~ Term)* }
Operator = { "+" | "-" | "*" | "^" }
Int = @{ Operator? ~ ASCII_DIGIT+ }
WHITESPACE = _{ " " | "\t" }
EOF = _{ EOI | ";" }
And the following expressions are all parsed correctly:
1 + 2
1 - 2
1 + -2
1 - -2
1
+1
-1
But any expression that begins with a negative number errors
-1 + 2
errors with
--> 1:4
|
1 | -1 + 2
| ^---
|
= expected EOI
What I expect (what I would like) is for -1 + 2 to be treated the same as 1 + -2, that is, a Binary expression that is made up of two Unary Expressions.
I have toyed around with a lot of variations with no success. And, I'm open to using an entirely different paradigm if I need to, but I'd really like to keep the UnaryExpression idea since I've already built my parser around it.
I'm new to PEG, so I'd appreciate any help.
For what it's worth, I'm using Rust v1.59 and https://pest.rs/ to both parse and test my expressions.
Comments (1)
You have a small error in the Expr logic. In a PEG, choice is ordered: the alternative before the | takes precedence if both match. -1 is a valid UnaryExpr, so the program as a whole is expected to match SOI ~ UnaryExpr ~ EOF in this case. But there is additional input (+ 2) after it, which leads to this error.

If you reverse the alternatives of Expr so that Expr = { BinaryExpr | UnaryExpr }, the example works. The reason is that BinaryExpr is tried first, and UnaryExpr is tried only if that fails.
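To make the ordered-choice behaviour concrete, here is a minimal hand-written sketch in Rust. It does not use pest: the function names (`operator`, `int`, `unary`, `binary`, `parse_program`) and the `binary_first` flag are made up for illustration, parentheses are left out, and whitespace skipping only approximates what pest does implicitly. It is meant to show why the order of the alternatives matters, not to reproduce pest's internals.

```rust
// Simplified grammar being imitated (assumption: no parentheses):
//   Int        = @{ Operator? ~ ASCII_DIGIT+ }
//   UnaryExpr  =  { Operator ~ Int }
//   BinaryExpr =  { Int ~ (Operator ~ Int)* }
// Every function takes a start offset and returns the end offset on success.

/// Skip spaces and tabs, like the WHITESPACE rule.
fn skip_ws(s: &[u8], mut i: usize) -> usize {
    while i < s.len() && (s[i] == b' ' || s[i] == b'\t') {
        i += 1;
    }
    i
}

/// Operator = "+" | "-" | "*" | "^"
fn operator(s: &[u8], i: usize) -> Option<usize> {
    match s.get(i).copied() {
        Some(b'+') | Some(b'-') | Some(b'*') | Some(b'^') => Some(i + 1),
        _ => None,
    }
}

/// Int is atomic: an optional sign directly followed by one or more digits.
fn int(s: &[u8], i: usize) -> Option<usize> {
    let start = operator(s, i).unwrap_or(i);
    let mut j = start;
    while j < s.len() && s[j].is_ascii_digit() {
        j += 1;
    }
    if j > start { Some(j) } else { None }
}

/// UnaryExpr = Operator ~ Int
fn unary(s: &[u8], i: usize) -> Option<usize> {
    let j = operator(s, i)?;
    int(s, skip_ws(s, j))
}

/// BinaryExpr = Int ~ (Operator ~ Int)*
fn binary(s: &[u8], i: usize) -> Option<usize> {
    let mut end = int(s, i)?;
    loop {
        let j = skip_ws(s, end);
        // Try one more "Operator ~ Int" pair; stop at the first failure.
        match operator(s, j).and_then(|k| int(s, skip_ws(s, k))) {
            Some(k) => end = k,
            None => return Some(end),
        }
    }
}

/// Program = SOI ~ Expr ~ (EOI | ";")
/// `binary_first` selects the order of the two alternatives in Expr.
fn parse_program(input: &str, binary_first: bool) -> bool {
    let s = input.as_bytes();
    let start = skip_ws(s, 0);
    // Ordered choice: the second alternative is tried only if the first fails.
    // Once the first one succeeds, the choice is never revisited.
    let end = if binary_first {
        binary(s, start).or_else(|| unary(s, start))
    } else {
        unary(s, start).or_else(|| binary(s, start))
    };
    match end {
        Some(e) => {
            let rest = skip_ws(s, e);
            rest == s.len() || s[rest] == b';'
        }
        None => false,
    }
}
```

With the original order, `parse_program("-1 + 2", false)` returns `false`: `unary` succeeds on `-1`, the choice is settled, and the leftover `+ 2` is not end of input. With the reversed order, `parse_program("-1 + 2", true)` returns `true`. In this sketch, putting `binary` first also means a leading sign is absorbed by `int`, so a bare `-1` is accepted through the binary branch rather than the unary one.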