如何手动编写复杂的公式解析器?
嗯,这是与语言无关的,我更喜欢用 C# 或 F# 来做,但这次我对“无论如何如何工作”这个问题更感兴趣。
我想要完成的是:
a)我想学习它 - 这次是关于我的自我,这是一个有趣的项目,我想向自己展示我真的很擅长这个东西
b)我知道一点点关于 EBNF 的一些信息(虽然我还不知道 EBNF 中的运算符优先级如何工作 - Irony.NET 做得对,我检查了示例,但这对我来说有点不祥)
c)我的解析器应该能够接受这个: 5 * (3 + (2 - 9 * (5 / 7)) + 9) 例如,给我正确的结果
d) 坦率地说,这似乎是编写编译器甚至解释器时最大的问题为我。我什至可以毫无问题地生成 64 位汇编代码(我可以手动编写汇编程序),但公式解析器...
e) 另一种想法:即使是简单的计算机(例如我的旧 Sharp 1246S,只有大约 2kB RAM)也可以做到这一点...这不可能那么难,对吧?甚至非常非常古老的编程语言也有公式计算... BASIC 是从 1964 年开始的,它们已经可以计算我作为示例提供的那种公式
f) 一些想法、一些灵感就足够了 - 我只是没有提示如何执行运算符优先级和括号 - 但是,我知道它涉及 AST 并且许多人使用堆栈
那么,您怎么看?
Hm, this is language - agnostic, I would prefer doing it in C# or F#, but I'm more interested this time in the question "how would that work anyway".
What I want to accomplish ist:
a) I want to LEARN it - it's about my ego this time, it's for a fun project where I want to show myself that I'm a really good at this stuff
b) I know a tiny little bit about EBNF (although I don't know yet, how operator precedence works in EBNF - Irony.NET does it right, I checked the examples, but this is a bit ominous to me)
c) My parser should be able to take this: 5 * (3 + (2 - 9 * (5 / 7)) + 9) for example and give me the right results
d) To be quite frankly, this seems to be the biggest problem in writing a compiler or even an interpreter for me. I would have no problem generating even 64 bit assembler code (I CAN write assembler manually), but the formula parser...
e) Another thought: even simple computers (like my old Sharp 1246S with only about 2kB of RAM) can do that... it can't be THAT hard, right? And even very, very old programming languages have formula evaluation... BASIC is from 1964 and they already could calculate the kind of formula I presented as an example
f) A few ideas, a few inspirations would be really enough - I just have no clue how to do operator precedence and the parentheses - I DO, however, know that it involves an AST and that many people use a stack
So, what do you think?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
您应该去了解递归下降解析器。
查看 Code Golf 练习,以 10 种不同的方式执行此操作:
Code Golf:数学表达式求值器(尊重 PEMDAS)
其中一些“高尔夫”解决方案是递归下降解析器,只是以不同的方式编码。
您会发现,仅进行表达式解析是编译器中迄今为止最简单的事情。解析语言的其余部分比较困难,但理解代码元素如何交互以及如何生成良好的代码则要困难得多。
您可能还对如何使用 BNF 表达解析器以及如何使用该 BNF 执行某些操作感兴趣。这是
如何使用显式 BNF 和符号解析和操作代数的示例隐式 AST 作为基础。这不是编译器传统上所做的事情,但所做的工作是深深植根于编译器技术的。
You should go learn about Recursive Descent parsers.
Check out a Code Golf exercise in doing just this, 10 different ways:
Code Golf: Mathematical expression evaluator (that respects PEMDAS)
Several of these "golf" solutions are recursive descent parsers just coded in different ways.
You'll find that doing just expression parsing is by far the easiest thing in a compiler. Parsing the rest of the language is harder, but understanding how the code elements interact and how to generate good code is far more difficult.
You may also be interested in how to express a parser using BNF, and how to do something with that BNF. Here's
an example of how to parse and manipulate algebra symbolically with an explicit BNF and an implicit AST as a foundation. This isn't what compilers traditionally do, but the machinery that does is founded deeply in compiler technology.
对于在 PHP 中实现的基于堆栈的解析器,该解析器使用 Djikstra 的分流码算法将中缀转换为后缀表示法,并且支持具有不同数量参数的函数,您可以查看 PHPExcel计算引擎
For a stack-based parser implemented in PHP that uses Djikstra's shunting yard algorithm to convert infix to postfix notation, and with support for functions with varying number of arguments, you can look at the source for the PHPExcel calculation engine
传统上,计算机上的公式处理器使用 POSTFIX 表示法。他们使用堆栈,弹出 2 个项目作为操作数,弹出第三个项目作为运算符,然后压入结果。
您想要的是一个 INFIX 到 POSTFIX 表示法转换器,这确实非常简单。一旦您进入后缀处理,这就是您所做的最简单的事情。
Traditionally formula processors on computers use POSTFIX notation. They use a stack, pop 2 items as operands, pop the third item as the operator, and push the result.
What you want is an INFIX to POSTFIX notation converter which is really quite simple. Once you're in postfix processing is the simplest thing you'll ever do.
如果您想采用现有的解决方案,我可以推荐一个工作、与 PSR-0 兼容的调车场算法实现:https://github.com/andig/php-shunting-yard/tree/dev。
If you want to go for an existing solution I can recommend a working, PSR-0 compatible implementation of the shunting yard algorithm: https://github.com/andig/php-shunting-yard/tree/dev.