Limitations of PEG grammars and parser generators?

Posted 2024-08-14 07:09:21

Comments (2)

落花浅忆 2024-08-21 07:09:21

I think the big "problem" with PEGs is that they don't fit into the normal taxonomy of grammars, because they operate in a fundamentally different way. Normal grammars are "backwards" in the sense that they describe all the possible sentences (programs) that can be generated. PEGs describe how to parse: they come at the problem from the other end.

In my view this is a more natural way to think about the problem, and certainly for any hand-written (recursive-descent) parser I wouldn't do anything else.

时光无声 2024-08-21 07:09:21

The main limitation of PEG grammars is that they don't deal with ambiguity at all.

To be sure, this is also their strength, since dealing with ambiguities is one of the most frustrating parts of using a CFG (context-free grammar) tool.

With PEGs you deal with ambiguity explicitly: you place the rule you want to match before any other rule that would also match the same input but that you don't want.
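To make the ordering point concrete, here is a minimal sketch of PEG-style ordered choice written as hand-rolled Python combinators. The helper names (`lit`, `seq`, `choice`, `eof`, `matches`) and the toy operator rule are my own assumptions for illustration, not the API of any particular PEG generator.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position on success,
# or None on failure. All names here are hypothetical helpers.
Parser = Callable[[str, int], Optional[int]]

def lit(s: str) -> Parser:
    """Match the literal string s at the current position."""
    def parse(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Match each parser in turn; fail as soon as one fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers: Parser) -> Parser:
    """PEG ordered choice: commit to the first alternative that succeeds."""
    def parse(text: str, pos: int) -> Optional[int]:
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse

def eof(text: str, pos: int) -> Optional[int]:
    """Succeed only at the end of the input."""
    return pos if pos == len(text) else None

def matches(parser: Parser, text: str) -> bool:
    """True if parser consumes the whole input."""
    return seq(parser, eof)(text, 0) is not None

# The same two alternatives in two different orders.
short_first = choice(lit("<"), lit("<="))   # "<=" can never be selected
long_first = choice(lit("<="), lit("<"))    # the order you actually want

print(matches(seq(long_first, lit("1")), "<=1"))   # True
print(matches(seq(short_first, lit("1")), "<=1"))  # False
```

With `short_first`, the choice commits to `"<"` and never revisits that decision, so the input `<=1` fails even though an alternative that would have matched is sitting right there in the rule.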

The problem is that you don't always know about some, or even any, of the ambiguities in a language or grammar. PEG generators, at least the ones I've tried, don't analyse the grammar for ambiguity, so they can't help you find the ambiguities and then design and order your rules to handle them correctly.
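As a rough idea of the kind of analysis that is missing, the sketch below flags one very narrow case: in an ordered choice made up only of literal strings, an alternative can never be selected if an earlier alternative is a prefix of it. The function name `shadowed_literals` is hypothetical, and this is a toy heuristic of my own, not something I know any PEG generator to ship.

```python
def shadowed_literals(alternatives):
    """Return (earlier, later) pairs where `later` can never be chosen.

    Only handles an ordered choice whose alternatives are all plain
    literals: if `earlier` is a prefix of `later`, then `earlier`
    succeeds on every input where `later` would, and it is tried first.
    """
    pairs = []
    for j, later in enumerate(alternatives):
        for earlier in alternatives[:j]:
            if later.startswith(earlier):
                pairs.append((earlier, later))
                break  # one shadowing alternative is enough to report
    return pairs

# "<" is listed before "<=", so "<=" is unreachable; ">=" before ">" is fine.
print(shadowed_literals(["<", "<=", ">=", ">"]))  # [('<', '<=')]
```

Real grammars mix literals with sequences, repetition and predicates, so a check like this catches only the most obvious shadowing; it says nothing about ambiguity in general.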

CFG parser generators like yacc and bison analyse your grammar and report conflicts, which is how ambiguities show up (an ambiguous grammar always produces at least one conflict). Unfortunately they often report them in a pretty cryptic way that can be hard to make sense of. And of course it's often hard to fix the grammar to deal with them. But at least you will be aware that they exist.

With a PEG grammar you can be blissfully ignorant of the ambiguities in your conceptual grammar, because once you make it a PEG it doesn't have ambiguities any more. It just has rules that match, and maybe silently unreachable rules that would also have matched had they been given higher precedence. These might not show up in your testing but may show up after release.
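The classic dangling-else case shows how quietly this happens. The sketch below hand-codes a recursive-descent parser for a hypothetical toy rule, `stmt <- "if" stmt ("else" stmt)? / "s"`, where the condition is omitted and `"s"` stands for any simple statement. The grammar, token names and tree shape are assumptions made up for this example.

```python
def parse_stmt(tokens, pos):
    """Parse: stmt <- "if" stmt ("else" stmt)? / "s"   (toy grammar).

    Returns (tree, next_pos) on success, or None on failure.
    """
    # First alternative: "if" stmt ("else" stmt)?
    if pos < len(tokens) and tokens[pos] == "if":
        result = parse_stmt(tokens, pos + 1)
        if result is not None:
            then_branch, after_then = result
            # The optional part is greedy: take an "else" if one is here.
            if after_then < len(tokens) and tokens[after_then] == "else":
                else_result = parse_stmt(tokens, after_then + 1)
                if else_result is not None:
                    else_branch, after_else = else_result
                    return ("if", then_branch, else_branch), after_else
            return ("if", then_branch, None), after_then
    # Second alternative: a simple statement.
    if pos < len(tokens) and tokens[pos] == "s":
        return "s", pos + 1
    return None

# "if if s else s" has two readings in the conceptual grammar.
tokens = ["if", "if", "s", "else", "s"]
tree, end = parse_stmt(tokens, 0)
print(tree)  # ('if', ('if', 's', 's'), None)
```

The `else` is attached to the inner `if` without any warning. A yacc-style tool given the naive CFG for the same construct would typically report a shift/reduce conflict, which at least tells you that a decision was made.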

With CFGs you are forced to deal with ambiguities during development, but it won't be easy.


In case I'm not making this clear, here's a six-year-old discussion by Joshua Haberman over on the Lambda the Ultimate programming languages blog: PEGs and Packrat Parsing are not the answer (archive.org).
