A Python parser for Python-like languages

Posted on 2025-01-08 19:57:10

I'm looking to write a Python import filter or preprocessor for source files that are essentially Python with extra language elements. The goal is to read the source file, parse it to an abstract syntax tree, apply some transforms in order to implement the new parts of the language, and write valid Python source which can then be consumed by CPython. I want to write this thing in Python and am looking for the best parser for the task.

The parser built into Python is not appropriate because it requires the source files to be actual Python, which these will not be. There are tons of parsers (or parser generators) that will work with Python, but it's hard to tell which is best for my needs without a whole bunch of research.
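To make the problem concrete, here is a minimal sketch. The `unless` statement is a made-up extension used purely for illustration (it is not the actual language in question); the standard `ast` module rejects it outright:

```python
import ast

# Hypothetical extended syntax: `unless` is an invented keyword for illustration.
EXTENDED_SOURCE = "unless x > 3:\n    print('small')\n"


def parses_as_python(source):
    """Return True if CPython's own parser accepts the source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True
```

Calling `parses_as_python(EXTENDED_SOURCE)` returns `False`, so the built-in parser never gets as far as producing a tree that could be transformed.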

In summary, my requirements are:

  1. Parser is written in Python or has Python bindings.
  2. Comes with a Python grammar that I can tweak, or can easily consume a tweakable Python grammar available elsewhere (such as http://docs.python.org/reference/grammar.html).
  3. Can re-serialize the AST after transforming it.
  4. Should not be too horrific to work with API-wise.

Any suggestions?
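For illustration only, the read/transform/write pipeline might look roughly like the sketch below. It works at the token level with the standard `tokenize` module rather than on a real syntax tree, so it is not a substitute for the parser being asked about. The `unless` keyword and the `translate_unless` helper are hypothetical, and the colon detection is naive (a `lambda` in the condition would confuse it).

```python
import ast
import io
import tokenize


def translate_unless(source):
    """Rewrite `unless COND:` headers as `if not (COND):` (hypothetical extension)."""
    out = []
    depth = 0             # bracket nesting inside the current `unless` header
    in_header = False     # True between `unless` and the colon that ends its header
    at_line_start = True  # True when the next token begins a logical line
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        kind, text = tok.type, tok.string
        if kind == tokenize.NAME and text == "unless" and at_line_start:
            out += [(tokenize.NAME, "if"), (tokenize.NAME, "not"), (tokenize.OP, "(")]
            in_header, depth = True, 0
        elif in_header and kind == tokenize.OP and text == ":" and depth == 0:
            # Close the parenthesis opened above, then emit the header's colon.
            out += [(tokenize.OP, ")"), (tokenize.OP, ":")]
            in_header = False
        else:
            if in_header and kind == tokenize.OP:
                if text in ("(", "[", "{"):
                    depth += 1
                elif text in (")", "]", "}"):
                    depth -= 1
            out.append((kind, text))
        at_line_start = kind in (
            tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT
        )
    python_source = tokenize.untokenize(out)
    ast.parse(python_source)  # raises SyntaxError if the output is not valid Python
    return python_source
```

The result is valid Python that CPython can compile, although `untokenize` does not preserve the original spacing. A parser with a tweakable grammar and a proper tree serializer would avoid both the fragile colon detection and the lost formatting.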

Comments (3)

北方。的韩爷 2025-01-15 19:57:10

The first thing that comes to mind is lib2to3. It is a complete pure-Python implementation of a Python parser. It reads a Python grammar file and parses Python source files according to this grammar. It offers a great infrastructure for performing AST manipulations and writing back nicely formatted Python code. After all, its purpose is to transform between two Python-like languages with slightly different grammars.

Unfortunately it's lacking documentation and doesn't guarantee a stable interface. There are projects that build on top of lib2to3 nevertheless, and the source code is quite readable. If API stability is an issue, you can just fork it.
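As a rough sketch of what that looks like in practice: the snippet below parses ordinary Python with lib2to3's stock grammar, renames an identifier in the tree, and serializes the tree back to source. The `rename_identifier` helper is made up for this example, and since lib2to3 is undocumented, treat the calls as a reading of its source rather than a supported API. The import is guarded because lib2to3 was removed from the standard library in Python 3.13.

```python
import warnings

try:
    with warnings.catch_warnings():
        # lib2to3 emits a DeprecationWarning on import in newer Pythons.
        warnings.simplefilter("ignore", DeprecationWarning)
        from lib2to3 import pygram, pytree
        from lib2to3.pgen2 import driver, token
except ImportError:  # Python 3.13+ no longer ships lib2to3
    pygram = None


def rename_identifier(source, old, new):
    """Parse `source`, rename every NAME leaf equal to `old`, and write it back."""
    if pygram is None:
        raise RuntimeError("lib2to3 is not available in this Python")
    d = driver.Driver(pygram.python_grammar_no_print_statement,
                      convert=pytree.convert)
    tree = d.parse_string(source)  # source should end with a newline
    for leaf in tree.leaves():
        if leaf.type == token.NAME and leaf.value == old:
            leaf.value = new
            leaf.changed()
    # str() on the tree reproduces the source, including comments and spacing.
    return str(tree)
```

For an extended language, the stock grammar would presumably be swapped for one loaded from a modified copy of lib2to3's `Grammar.txt`; that step is not shown here.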

零崎曲识 2025-01-15 19:57:10

I would recommend that you check out my library: https://github.com/erezsh/lark

It can parse ALL context-free grammars, automatically builds an AST (with line & column numbers), and accepts the grammar in EBNF format, which is considered the standard.

It can easily parse a language like Python, and it can do so faster than any other parsing library written in Python.

In fact, there's already an example Python grammar and parser.

子栖 2025-01-15 19:57:10

I like SimpleParse a lot, but I never tried to feed it the Python grammar (BTW, is it a deterministic grammar?). If it chokes, PLY will do the job.

See this compilation about Python parsing tools.
