PyParsing 中的简单递归下降
我尝试使用此代码 并将其转换为我正在从事的编程语言处理项目的内容,但我遇到了简化版本的问题:
op = oneOf( '+ - / *')
lparen, rparen = Literal('('), Literal(')')
expr = Forward()
expr << ( Word(nums) | ( expr + op + expr ) | ( lparen + expr + rparen) )
我已经对这个简单的设置进行了许多不同的修改。通常,尝试类似:
print(expr.parseString('1+2'))
将返回['1']
。当我陷入深度递归时,我会遇到这样的问题:
print(expr.parseString('(1+2)'))
关于简单递归,我缺少什么,我无法解析任意算术表达式,例如 1+(2 * 3-(4*(5+6) -(7))...
?
I've tried taking this code and converting it to something for a project I'm working on for programming language processing, but I'm running into an issue with a simplified version:
op = oneOf( '+ - / *')
lparen, rparen = Literal('('), Literal(')')
expr = Forward()
expr << ( Word(nums) | ( expr + op + expr ) | ( lparen + expr + rparen) )
I've played around with a number of different modifications of this simple setup. Usually, trying something like:
print(expr.parseString('1+2'))
Will return ['1']
. While I get caught in deep recursion with something like:
print(expr.parseString('(1+2)'))
What am I missing with respect to simple recursion that I can't parse arbitrarily arithmetic expressions, such as 1+(2 * 3-(4*(5+6)-(7))...
?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
哇,我想 pyparsing 真的很受欢迎!感谢亚历克斯和约翰介入解决这个问题。你们的回答都切中要害。但让我添加一两条评论:
如果我们抑制左括号和右括号符号,并使用 Group 对括号表达式进行分组,pyparsing 将得到一个更接近 AST 的结构化结果。
给予:
<前><代码>(9 + 3) -> [[‘9’、‘+’、‘3’]]
(9 + 3) * (4 / 5) -> [['9', '+', '3'], '*', ['4', '/', '5']]
否则,pyparsing 只是标记化,您必须遍历已解析标记的列表来查找嵌套表达式。
由于 op 被定义为 oneOf("+ - * /"),因此操作没有优先级。 pyparsing 存储库上有示例 https://github.com/pyparsing/pyparsing/ tree/master/examples 手动定义此方法 (fourFn.py),或者使用
infixNotation
帮助程序 (simpleArith.py) 的最新方法。同样,这使 pyparsing 比仅仅标记化添加了更多价值。对于OP,请查看这些示例,我认为它们将帮助您推进项目。
——保罗
Wow, I guess pyparsing is really on the map! Thanks Alex and John for stepping in on this question. You are both on the mark with your responses. But let me add a comment or two:
If we suppress the opening and closing parenthesis symbols, and group the parenthesized expression using Group, pyparsing will a structured result that is closer to an AST.
Giving:
Otherwise, pyparsing is just tokenizing, and you have to walk the list of parsed tokens to find the nested expressions.
Since op is defined as just oneOf("+ - * /"), there is no precedence of operations. There are examples on the pyparsing repo at https://github.com/pyparsing/pyparsing/tree/master/examples of the manual way to define this (fourFn.py), or the more recent approach using the
infixNotation
helper (simpleArith.py). Again, this has pyparsing adding more value than just tokenizing.To the OP, please check out those examples, I think they will help move you forward on your project.
-- Paul
这或多或少是你想要的......?
发射
?这通过将“原子”(数字或括号内的表达式)与“表达式”(一个或多个带有运算符的“原子”)分开来“锚定”递归。
Is this more or less what you want...?
emitting
? This "anchors" the recursion by separating an "atom" (number or parenthesized expression) from an "expression" (one or more "atoms" with operators in-between).
像这样的语法
很难使用,因为递归总是向左潜入。
正常的算术语法看起来像这样:
基本上,你永远不会得到
S :: S
;每当非终结符出现在语法中一行的左侧和右侧时,中间必须有一些文字供解析器使用。A grammar like:
is hard to work with because the recursion just keeps diving into the left.
A normal arithmetic grammar would look something like:
Basically, you never get
S :: S
; any time a nonterminal appears on the left and right hand sides of a line in the grammar, there must be some literal in the middle for the parser to consume.使用
operatorPrecedence
构建表达式。它将构建正确的表达式,并在执行时处理运算符优先级:示例:
Use
operatorPrecedence
to build expressions. It'll build the correct expressions, and take care of operator precedence while at it:example: