Dynamic(?) parser
Does there exist a parser that generates an AST/parse tree at runtime? Kind of like a library that would accept a string of EBNF grammar or something analogous and spit out a data structure?
- I'm aware of ANTLR, JLex and their ilk. They generate source code that could do this, but I'd like to skip the compile step.
- I'm aware of Boost::Spirit, which uses some black magic with C++ syntax to generate such things at execution time. That is definitely much closer to what I want, but I'm a wuss when it comes to C++, and it's still somewhat limiting because your grammar is hardcoded.
- I'm not aware of anything in Python or Ruby, although a compiler-compiler might very well be effective in such a language...
Now I'm aware of parser combinators (thanks, Jonas) and some libraries (thanks, eliben).
Incidentally, I also noticed Parsing Expression Grammars lately, which sound cool were someone to implement them (they say Perl 6 will have them, but Perl evades my understanding).
Comments (6)
Take a look at parser combinators, which I think may help you. With this technique it is possible to build parsers at runtime. One popular parser combinator library is Parsec, which uses Haskell as its host language. From the Parsec documentation:
If you are using .NET, take a look at the parser combinator library for F#.
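To make the "parsers at runtime" point concrete, here is a minimal sketch of the combinator idea in Python. It is not Parsec's API; the names `literal`, `seq`, `alt` and `many` are made up for illustration. A parser is assumed to be a plain function from `(text, position)` to `(value, new_position)`, or `None` on failure, so bigger parsers are just ordinary values assembled while the program runs.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption for this sketch: a parser takes (text, pos) and returns
# (value, new_pos) on success or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def literal(s: str) -> Parser:
    """Match the exact string s."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Run the parsers one after another; succeed only if all succeed."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers: Parser) -> Parser:
    """Try each parser in order and return the first success."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(p: Parser) -> Parser:
    """Apply p zero or more times, collecting the results."""
    def parse(text, pos):
        values = []
        while True:
            result = p(text, pos)
            # Stop on failure, or if p succeeded without consuming input.
            if result is None or result[1] == pos:
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# The grammar is built from ordinary data at runtime: this keyword list
# could just as well come from a config file or user input.
keywords = ["let", "in"]
keyword = alt(*[literal(k) for k in keywords])
example = seq(literal("a"), many(literal("b")), keyword)

print(example("abblet", 0))
```

Because `example` is just a value, nothing stops you from rebuilding it with a different keyword list halfway through a run, which is the part a generated-source approach makes awkward.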
If Java is better for you, there is a port of the Haskell Parsec library: JParsec. It is very powerful, though the documentation isn't great.
You can coerce it into doing a straightforward lex-then-parse phase, but you can also do some interesting things with dynamic lexing and dynamic grammars.
Head-twisting stuff.
Because it's all in Java (your Parser is a POJO), you can refactor, do TDD, and do whatever else you're used to doing in Java. This is a major advantage over the more traditional ANTLR/JavaCC/JJTree approach.
Lambda the Ultimate discussed a parser that allows syntax extensions.
Yes, of course!
In all the dynamic languages, this is very simple to achieve, because code can easily be generated and evaluated at runtime. I will recommend two alternatives:
I personally recommend the Python option, though it may not be relevant if you know Perl but not Python.
For completeness, I must note that you can do it with Lex & Yacc as well, but it's hairy. You'll have to generate a Lex/Yacc file from your grammar at runtime, run it through Lex/Yacc to get C, compile that into a shared library, and load it at runtime. This sounds like science fiction, but some tools actually do this when they need both efficiency and dynamism.
Good luck.
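For what the question literally asks (hand over a grammar as a string, get a tree back), a rough sketch in Python might look like the following. Note that it interprets the grammar directly instead of generating and evaluating code, and it is not the API of any existing library. The grammar notation is invented for this example: one `name = ...` rule per line, alternatives separated by `|`, items separated by spaces, where an item is a `'quoted literal'`, a `/regex/`, or the name of another rule. Choice is ordered, PEG-style, and left recursion is assumed to be absent.

```python
import re

def load_grammar(text):
    """Turn grammar text into {rule_name: [alternative, ...]}.

    Each alternative is a list of items. Limitation of this sketch:
    literals and regexes may not contain spaces or '|'.
    """
    rules = {}
    for line in text.strip().splitlines():
        name, _, body = line.partition("=")
        rules[name.strip()] = [alt.split() for alt in body.split("|")]
    return rules

def match_item(rules, item, text, pos):
    """Match one grammar item at pos; return (node, new_pos) or None."""
    if item[0] == "'":  # quoted literal
        lit = item[1:-1]
        return (lit, pos + len(lit)) if text.startswith(lit, pos) else None
    if item[0] == "/":  # regular expression
        m = re.compile(item[1:-1]).match(text, pos)
        return (m.group(), m.end()) if m else None
    return parse(rules, item, text, pos)  # reference to another rule

def parse(rules, rule, text, pos=0):
    """Match `rule` at pos; return ((rule, children), new_pos) or None.

    Alternatives are tried in order and the first that matches wins.
    """
    for alternative in rules[rule]:
        children, p = [], pos
        for item in alternative:
            result = match_item(rules, item, text, p)
            if result is None:
                break  # this alternative failed, try the next one
            node, p = result
            children.append(node)
        else:
            return (rule, children), p
    return None

# The grammar is plain data: it could be read from a file or typed in by a user.
GRAMMAR = """
expr = term '+' expr | term
term = /[0-9]+/ | '(' expr ')'
"""
rules = load_grammar(GRAMMAR)
tree, end = parse(rules, "expr", "1+(2+3)")
print(tree, end)
```

The tree is nested `(rule_name, children)` tuples with matched text at the leaves. A real tool would add error reporting, whitespace handling and memoization, but the point stands that no compile step sits between reading the grammar and parsing with it.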
JFlex, the JLex Java extension, lets you do runtime compilation, but it is pretty hairy stuff.
What are you going to parse? In C or C++ there is no parser available at runtime, so you won't get one without an additional library. The same is true of many programming languages.
All parsers are 'dynamic' by default when you implement them yourself. Even in C.
If the language you are going to parse is your own: writing parsers is a skill to learn in its own right. Even with parser generators it's a job in itself. Once you've learned it, it becomes pretty simple. Special tricks like indented syntax will still be tricky, though, and you will need good, extensive tests to check that the parser does what you want. I've written a parser, so I know.
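As an illustration of why indented syntax needs care and tests, here is one common way to handle it, sketched in Python: turn changes in indentation into explicit INDENT/DEDENT tokens before the grammar proper ever sees the input. The function name `indent_tokens` and the token shapes are made up for this example, and it assumes indentation uses spaces only (no tabs).

```python
def indent_tokens(source):
    """Return a list of ('INDENT', None), ('DEDENT', None) and ('LINE', text) tokens."""
    stack = [0]  # indentation widths of the blocks currently open
    tokens = []
    for lineno, raw in enumerate(source.splitlines(), start=1):
        if not raw.strip():
            continue  # blank lines do not affect block structure
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)  # deeper than the current block: open a new one
            tokens.append(("INDENT", None))
        while width < stack[-1]:
            stack.pop()  # shallower: close blocks until the widths line up
            tokens.append(("DEDENT", None))
        if width != stack[-1]:
            raise SyntaxError(f"line {lineno}: dedent matches no open block")
        tokens.append(("LINE", raw.strip()))
    # Close whatever blocks are still open at end of input.
    tokens.extend(("DEDENT", None) for _ in stack[1:])
    return tokens

for token in indent_tokens("a\n  b\n    c\n  d\ne\n"):
    print(token)
```

Even something this small has edge cases worth a test each: blank lines, blocks left open at end of input, and a dedent to a width that was never opened (for example `"a\n    b\n  c"`, which raises `SyntaxError` here).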