Python parser for a Python-like language
I'm looking to write a Python import filter or preprocessor for source files that are essentially Python with extra language elements. The goal is to read the source file, parse it to an abstract syntax tree, apply some transforms in order to implement the new parts of the language, and write valid Python source which can then be consumed by CPython. I want to write this thing in Python and am looking for the best parser for the task.
The parser built into Python is not appropriate because it requires that the source files be actual Python, which these will not be. There are tons of parsers (or parser generators) that will work with Python, but it's hard to tell which is best for my needs without a whole bunch of research.
In summary, my requirements are:
- Parser is written in Python or has Python bindings.
- Comes with a Python grammar that I can tweak, or can easily consume a tweakable Python grammar available elsewhere (such as http://docs.python.org/reference/grammar.html).
- Can re-serialize the AST after transforming it.
- Should not be too horrific to work with API-wise.
Any suggestions?
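
To make the "import filter" part of the question concrete, here is a minimal sketch of where such a transform could plug into CPython's import machinery, using only the standard library. The `unless` keyword, the `preprocess` function, the `PreprocessingLoader` class, and the `load_extended_module` helper are all hypothetical names invented for illustration. The regex rewrite is only a stand-in for the real parse/transform/re-serialize step the question asks about; it is not a parser.

```python
import importlib.machinery
import importlib.util
import re

# Hypothetical language extension: `unless cond:` means `if not (cond):`.
_UNLESS = re.compile(r"^([ \t]*)unless[ \t]+(.*):[ \t]*$", re.MULTILINE)


def preprocess(source: str) -> str:
    """Placeholder transform. A real filter would parse the source,
    rewrite the tree, and re-serialize it to valid Python."""
    return _UNLESS.sub(r"\1if not (\2):", source)


class PreprocessingLoader(importlib.machinery.SourceFileLoader):
    """Source loader that rewrites the file's text before CPython compiles it."""

    def source_to_code(self, data, path, **kwargs):
        # `data` is the raw bytes of the file; decode it the way the
        # regular import system would, then hand over plain Python.
        source = importlib.util.decode_source(data)
        return super().source_to_code(preprocess(source), path, **kwargs)


def load_extended_module(name: str, path: str):
    """Load the extended-syntax file at `path` as a module called `name`."""
    loader = PreprocessingLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

Calling `load_extended_module("demo", "/some/path/demo.pyx")` (path and extension chosen arbitrarily here) would then return a normal module object. One design caveat: `SourceFileLoader` caches bytecode keyed on the source file's metadata, so changing `preprocess` alone may not invalidate an existing cache.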
Comments (3)
The first thing that comes to mind is lib2to3. It is a complete pure-Python implementation of a Python parser. It reads a Python grammar file and parses Python source files according to this grammar. It offers a great infrastructure for performing AST manipulations and writing back nicely formatted Python code; after all, its purpose is to transform between two Python-like languages with slightly different grammars. Unfortunately, it lacks documentation and doesn't guarantee a stable interface. Nevertheless, there are projects that build on top of lib2to3, and the source code is quite readable. If API stability is an issue, you can just fork it.
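
As a rough illustration of the parse, transform, and write-back cycle described for lib2to3, the sketch below renames an identifier and serializes the tree again. The `rename_identifier` helper is a hypothetical name, and the example uses the stock grammar shipped with lib2to3 rather than a tweaked one. It also assumes an interpreter that still ships lib2to3: the package is deprecated since Python 3.11 and was removed in 3.13.

```python
from lib2to3 import pygram, pytree
from lib2to3.pgen2 import driver, token


def rename_identifier(source: str, old: str, new: str) -> str:
    """Parse `source`, rename every NAME leaf equal to `old`, re-serialize.

    `source` should end with a newline, as the lib2to3 parser expects.
    """
    drv = driver.Driver(pygram.python_grammar_no_print_statement,
                        convert=pytree.convert)
    tree = drv.parse_string(source)
    for leaf in tree.leaves():
        if leaf.type == token.NAME and leaf.value == old:
            leaf.value = new
    # str() on a lib2to3 tree reproduces the source text, including the
    # whitespace and comments stored in each leaf's prefix.
    return str(tree)
```

Because whitespace and comments are kept on the leaves, untouched parts of the file come back unchanged. For a language with extra syntax, a modified copy of lib2to3's `Grammar.txt` could presumably be loaded through `lib2to3.pgen2.pgen.generate_grammar` and passed to `Driver` in place of the stock grammar; that part is not shown here.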
I would recommend that you check out my library: https://github.com/erezsh/lark
It can parse ALL context-free grammars, automatically builds an AST (with line & column numbers), and accepts the grammar in EBNF format, which is considered the standard.
It can easily parse a language like Python, and it can do so faster than any other parsing library written in Python.
In fact, there's already an example Python grammar and parser.
I like SimpleParse a lot, but I never tried to feed it the Python grammar (BTW, is it a deterministic grammar?). If it chokes, PLY will do the job.
See this compilation about Python parsing tools.