What is the use of empty productions in PEG?

Posted on 2024-11-05 13:10:03


The empty production rule

nonterminal -> epsilon

is useful in lex-yacc LR bottom up parser generators (e.g. PLY).

In what context should one use Empty productions in PEG parsers e.g. pyparsing ?


昔梦 2024-11-12 13:10:03

BNF grammars often use empty as an alternative, effectively making the overall expression optional:

leading_sign ::= + | - | empty
integer ::= leading_sign digit...

This is unnecessary in pyparsing, since pyparsing includes the Optional class for this:

# no empty required
leading_sign = Optional(oneOf("+ -"))
integer = leading_sign + Word(nums)
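
As a rough sketch of the comparison above (assuming pyparsing is installed; the names bnf_sign and bnf_integer are made up for illustration), both spellings accept signed and unsigned integers. The Optional form simply states the intent more directly:

```python
from pyparsing import Empty, Literal, Optional, Word, nums, oneOf

# pyparsing style: Optional makes the sign optional
leading_sign = Optional(oneOf("+ -"))
integer = leading_sign + Word(nums)

# BNF style (hypothetical names): Empty() as the last alternative.
# The module-level `empty` is an instance of this class.
bnf_sign = Literal("+") | Literal("-") | Empty()
bnf_integer = bnf_sign + Word(nums)

signed = integer.parseString("-42").asList()
unsigned = integer.parseString("42").asList()
bnf_signed = bnf_integer.parseString("-42").asList()
bnf_unsigned = bnf_integer.parseString("42").asList()
print(signed, unsigned, bnf_signed, bnf_unsigned)
```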

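Empty does come in handy for some pyparsing-specific purposes though:

Skipping whitespace - some elements in pyparsing, such as CharsNotIn and restOfLine, do not skip over whitespace before they start parsing. Suppose you have a simple input of key-value entries, in which the key is a quoted string and the value is everything after the quoted string, like this: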

"Key 1" value of Key 1
"Key 2" value of Key 2

Defining this as:

quotedString + restOfLine

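would give you " value of Key 1" and " value of Key 2" as the values, leading space included. Pyparsing's empty does skip over whitespace, so changing the grammar to: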

quotedString + empty + restOfLine

will give you values without the leading spaces.
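
A small sketch of the two grammars side by side, assuming pyparsing is installed. The variable names with_space and without_space are only for illustration, and Empty() is used here in place of the module-level empty instance:

```python
from pyparsing import Empty, quotedString, restOfLine

line = '"Key 1" value of Key 1'

# restOfLine does not skip whitespace, so the value keeps its leading space
with_space = (quotedString + restOfLine).parseString(line).asList()

# Empty() matches nothing but skips whitespace first, so restOfLine
# starts at the first non-blank character after the key
without_space = (quotedString + Empty() + restOfLine).parseString(line).asList()

print(with_space)
print(without_space)
```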

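Activating parse actions at specific places - I used empty expressions as part of the expression generated in originalTextFor to drop in start and end location markers. The parse actions on those empty expressions replace them with their location values, and the parse action for originalTextFor then uses those locations to slice the original text out of the input string.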
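
The idea can be sketched roughly as follows. This is an illustration of the technique described above, not the library's actual source; start_marker, end_marker, and the results names "start" and "end" are hypothetical:

```python
from pyparsing import Empty, OneOrMore, Word, alphas

# Hypothetical markers: Empty expressions whose parse action returns the
# current parse location instead of a matched token.
start_marker = Empty().setParseAction(lambda s, loc, toks: loc)
# leaveWhitespace() keeps the end marker from skipping trailing whitespace,
# so it reports the position right after the last matched word.
end_marker = Empty().leaveWhitespace().setParseAction(lambda s, loc, toks: loc)

words = OneOrMore(Word(alphas))
marked = start_marker("start") + words + end_marker("end")

text = "  hello   world  "
result = marked.parseString(text)

# Slice the untouched source text between the two recorded locations
original = text[result["start"]:result["end"]]
print(repr(original))
```

Slicing by location preserves the original spacing between the words, which the token list alone would lose.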

Be careful using empty. empty always matches, but never advances the parse location (except for skipping whitespace). So:

OneOrMore(empty)

will be an infinite loop.

empty | "A" | "B" | "C"

will never match any of the non-empty alternatives, since MatchFirst short-circuits on the first alternative that succeeds.
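
A minimal sketch of the short-circuit, assuming pyparsing is installed. The names empty_first and empty_last are illustrative, and the infinite-loop case is deliberately left out:

```python
from pyparsing import Empty, Literal

# Empty() first: it always succeeds, so "A", "B", "C" are never tried
empty_first = Empty() | "A" | "B" | "C"

# Empty() last: the real alternatives get a chance before the fallback
empty_last = Literal("A") | "B" | "C" | Empty()

first_result = empty_first.parseString("A").asList()
last_result = empty_last.parseString("A").asList()
print(first_result, last_result)
```

If an empty alternative is needed at all, placing it last (or using Optional) avoids the problem.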
