Dynamic(?) parser

Posted on 2024-07-06 18:08:32

Is there a parser that can be built at runtime and produce an AST/parse tree? Something like a library that accepts a string of EBNF grammar (or something analogous) and spits out a data structure?

  • I'm aware of ANTLR, JLex and their ilk. They generate source code that could do this, but I'd like to skip the compile step.
  • I'm aware of Boost::Spirit, which uses some black magic with C++ syntax to generate such things at execution time. It is definitely much closer to what I want, but I'm a wuss when it comes to C++, and it's still somewhat limiting because your grammar is hardcoded.
  • I'm not aware of anything in Python or Ruby, although a compiler compiler might very well be effective in such a language...

Now I'm aware of parser combinators (thanks, Jonas) and some libraries (thanks, eliben).

Incidentally, I also noticed Parsing Expression Grammars lately, which sound cool if someone were to implement them (they say Perl 6 will have them, but Perl evades my understanding).

Comments (6)

稍尽春風 2024-07-13 18:08:32

Take a look at parser combinators, which I think may help you. This technique makes it possible to build parsers at runtime. One popular parser combinator library is Parsec, which uses Haskell as its host language. From the Parsec documentation:

Combinator parsers are written and used within the same programming language as the rest of the program. There is no gap between the grammar formalism (Yacc) and the actual programming language used (C)

Parsers are first-class values within the language. They can be put into lists, passed as parameters and returned as values. It is easy to extend the available set of parsers with custom made parsers specific for a certain problem

If you are using .NET, take a look at the parser combinator library for F#.
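
To make "parsers are first-class values" concrete, here is a minimal sketch of the combinator idea in Python. This is not Parsec's API; the names `literal`, `seq`, `alt` and `many` are made up for illustration, and a parser is assumed to be a plain function from `(text, pos)` to `(value, new_pos)` or `None`.

```python
from typing import Any, Callable, Optional, Tuple

# Assumed representation: a parser maps (text, pos) to (value, new_pos), or None on failure.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def literal(s: str) -> Parser:
    """Match the exact string s at the current position."""
    def parse(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Run the parsers one after another and collect their values in a list."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers: Parser) -> Parser:
    """Return the result of the first parser that succeeds."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def many(parser: Parser) -> Parser:
    """Apply the parser zero or more times."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or when no input is consumed
                return values, pos
            value, pos = result
            values.append(value)
    return parse

# Parsers are ordinary values, so they can be assembled from runtime data.
keywords = ["let", "in"]  # could just as well come from user input
keyword = alt(*(literal(k) for k in keywords))
digits = many(alt(*(literal(d) for d in "0123456789")))
statement = seq(keyword, literal(" "), digits)
```

Since `keyword` is built from an ordinary list, changing the accepted language at runtime is just a matter of building a new value; no code generation or compile step is involved.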

欢烬 2024-07-13 18:08:32

If Java suits you better, there is a port of the Haskell Parsec library: JParsec. It is very powerful, though the documentation isn't great.

You can coerce it into a straightforward lex-then-parse phase, but you can also do some interesting things with dynamic lexing and dynamic grammars.

Head-twisting stuff.

Because it's all in Java (your Parser is a POJO), you can refactor, do TDD, and whatever else you're used to doing in Java. This is a major advantage over the more traditional ANTLR/JavaCC/JJTree approach.

○闲身 2024-07-13 18:08:32

Lambda the Ultimate discussed a parser that allows syntax extensions.

I'm planning to write a compiler that would allow syntax extensions (some kind of compile-time metaprogramming). I don't want a very powerful system, so I've thought about just having:

{syntax: while (condition) do code}
while (condition, code) => // actual execution

and replacing every pattern that matches the syntax with a call to the function.
However, I don't know where to start to get the lexer and parser running, because the usual tools such as Flex/Bison or ANTLR (I would like to write the compiler in C#) don't seem to allow this.

Could you give me any direction on where to go next? I've also read that Scheme or Haskell could be better languages for this task. And of course, I'm open to any suggestion about how to actually implement the idea.
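
As a rough illustration of the "replace every pattern with a call" idea from the quoted post, here is a deliberately naive Python sketch. It works at the text level with a regular expression built at runtime rather than with a real lexer and parser, it handles one statement per line, and every name in it (`make_rewriter`, the `while_` function name, the set of placeholders) is hypothetical.

```python
import re

def make_rewriter(pattern: str, placeholders: set, function: str):
    """Build a line rewriter from a pattern such as 'while (condition) do code'.

    Tokens listed in `placeholders` capture arbitrary text; every other token
    must appear literally. Matching lines become a call to `function`.
    """
    tokens = re.findall(r"\w+|\S", pattern)
    pieces = [
        f"(?P<{tok}>.+?)" if tok in placeholders else re.escape(tok)
        for tok in tokens
    ]
    regex = re.compile(r"\s*".join(pieces))  # allow optional whitespace between tokens

    def rewrite(line: str) -> str:
        match = regex.fullmatch(line.strip())
        if match is None:
            return line  # not an instance of this syntax; leave it untouched
        args = ", ".join(match.group(tok) for tok in tokens if tok in placeholders)
        return f"{function}({args})"

    return rewrite

# The syntax rule is registered at runtime from plain strings.
rewrite_while = make_rewriter("while (condition) do code", {"condition", "code"}, "while_")
rewritten = rewrite_while("while (x < 10) do x = x + 1")
```

A real implementation would want the extension to hook into the parser itself (so that nesting and precedence are handled properly), which is where the runtime-built parsers discussed in the other answers come in.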

难以启齿的温柔 2024-07-13 18:08:32

Yes, of course!

In all the dynamic languages, this is very simple to achieve, because code can easily be generated and evaluated at runtime. I would recommend two alternatives:

  • In Perl, use Parse::RecDescent. It takes its grammar from a string, and you can definitely ask it to generate a new parser from a new string at runtime.
  • In Python, consider PLY. You can easily generate functions with docstrings at runtime and run PLY on them.

I personally recommend the Python option, though it may not be relevant if you know Perl but not Python.

For completeness, I must note that you can do it with Lex & Yacc as well, but it's hairy. You'll have to generate a Lex / Yacc file from your grammar at runtime, compile it into C, compile that into a shared library and load it at runtime. This sounds like science fiction, but some tools actually do this for complex needs of efficiency and dynamism.

Good luck.
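
To show what "takes its grammar from a string" can look like, here is a small self-contained Python sketch. It is neither Parse::RecDescent's nor PLY's API: the grammar notation (`name = alternatives`, literals in single quotes, `|` for ordered choice) and the functions `compile_grammar` and `parse` are assumptions made for this example. Left recursion is not supported.

```python
import re

def compile_grammar(source: str) -> dict:
    """Turn a grammar string into a dict: rule name -> list of alternatives."""
    rules = {}
    for line in source.strip().splitlines():
        if not line.strip():
            continue
        name, rhs = line.split("=", 1)
        tokens = re.findall(r"'[^']*'|\||\w+", rhs)  # quoted literal, '|', or rule name
        alternatives, current = [], []
        for tok in tokens:
            if tok == "|":
                alternatives.append(current)
                current = []
            else:
                current.append(tok)
        alternatives.append(current)
        rules[name.strip()] = alternatives
    return rules

def parse(rules: dict, rule: str, text: str, pos: int = 0):
    """Return (tree, new_pos) for `rule` at `pos`, or None if nothing matches.

    Alternatives are tried in order and the first match wins.
    A tree node is a tuple (rule_name, children).
    """
    for alternative in rules[rule]:
        children, cur = [], pos
        for item in alternative:
            if item.startswith("'"):  # literal
                lit = item[1:-1]
                if not text.startswith(lit, cur):
                    break
                children.append(lit)
                cur += len(lit)
            else:  # reference to another rule
                result = parse(rules, item, text, cur)
                if result is None:
                    break
                subtree, cur = result
                children.append(subtree)
        else:  # every item of this alternative matched
            return (rule, children), cur
    return None

# The grammar is plain data, so it could come from a file or from user input.
GRAMMAR = """
expr  = term '+' expr | term
term  = digit term | digit
digit = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
"""

rules = compile_grammar(GRAMMAR)
tree, end = parse(rules, "expr", "12+3")
```

`parse` returns the position where it stopped, so a caller that wants the whole input consumed should check that `end == len(text)`.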

静赏你的温柔 2024-07-13 18:08:32

JFlex, a Java rewrite of JLex, lets you do runtime compilation, but it is pretty hairy stuff.

铜锣湾横着走 2024-07-13 18:08:32

What are you going to parse? C and C++ don't ship a parser in their runtime, so one isn't available without an additional library. The same is true for many programming languages.

All parsers are by default 'dynamic' when you implement them. Even in C.

If the language you are going to parse is your own: writing parsers is a skill to learn in its own right. Even with parser generators it's real work. Once you've learned it, it becomes pretty simple. Special tricks like indentation-based syntax will still be tricky, and you will need good, extensive tests to check that the parser does what you want. I've written a parser, so I know.
