pyparsing: matching any combination of specified literals

Posted on 2024-09-12 07:14:36

Example:
I have the literals "alpha", "beta", "gamma". How do I make pyparsing parse the following inputs:

alpha
alpha|beta
beta|alpha|gamma

Each input consists of one or more non-repeating literals from the given set, separated by "|". Advice on setting up pyparsing will be appreciated.

旧城烟雨 2024-09-19 07:14:36

Use the '&' operator for Each, instead of '+' or '|'. If you must have all of them, but in unpredictable order, use:

Literal('alpha') & 'beta' & 'gamma'

If some may be missing, but each is used at most once, then use Optionals:

Optional('alpha') & Optional('beta') & Optional('gamma')
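
To see how the two Each expressions behave, here is a small sketch. The `accepts` helper is hypothetical (it is not part of pyparsing); it only reports whether an expression consumes the whole input. Note that both forms match whitespace-separated tokens, so neither handles the "|" separators from the question.

```python
from pyparsing import Literal, Optional, ParseException

# All three literals are required, in any order.
all_required = Literal("alpha") & "beta" & "gamma"
# Each literal may appear at most once, in any order.
at_most_once = Optional("alpha") & Optional("beta") & Optional("gamma")

def accepts(expr, text):
    """Hypothetical helper: True if expr matches the entire text."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False

print(accepts(all_required, "gamma beta alpha"))  # all present, reordered
print(accepts(all_required, "alpha beta"))        # gamma is missing
print(accepts(at_most_once, "beta alpha"))        # subset, no repeats
print(accepts(at_most_once, "alpha alpha"))       # repeated literal
print(accepts(at_most_once, "alpha|beta"))        # "|" is not handled
```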

Oops, I forgot the '|' delimiters. A lenient parser can be built with delimitedList:

delimitedList(oneOf("alpha beta gamma"), '|')
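
As a quick check of what this lenient parser accepts, the sketch below parses one input from the question and one input with a repeated literal. The name `lenient` is only for illustration.

```python
from pyparsing import delimitedList, oneOf

lenient = delimitedList(oneOf("alpha beta gamma"), "|")

# The "|" delimiters are consumed but not returned as tokens.
print(lenient.parseString("beta|alpha|gamma", parseAll=True).asList())
# A repeated literal is still accepted at this stage.
print(lenient.parseString("alpha|alpha", parseAll=True).asList())
```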

This would allow any or all of your choices, but does not guard against duplicates. It may be simplest to reject duplicates with a parse action:

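from pyparsing import ParseException, delimitedList, oneOf

itemlist = delimitedList(oneOf("alpha beta gamma"), '|')

def ensureNoDuplicates(tokens):
    # A set drops repeats, so a size mismatch means some literal appeared twice.
    if len(set(tokens)) != len(tokens):
        raise ParseException("duplicate list entries found")

itemlist.setParseAction(ensureNoDuplicates)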

This feels like the simplest approach to me.
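
A possible variant of the parse action, sketched here under the assumption that you want the exception to carry the input string and the position where the list started. The names `reject_duplicates`, `checked`, and `is_valid` are hypothetical and chosen for this example only.

```python
from pyparsing import ParseException, delimitedList, oneOf

def reject_duplicates(s, loc, tokens):
    # Passing s and loc lets the exception report where the list started.
    if len(set(tokens)) != len(tokens):
        raise ParseException(s, loc, "duplicate list entries found")

checked = delimitedList(oneOf("alpha beta gamma"), "|")
checked.setParseAction(reject_duplicates)

def is_valid(text):
    """Hypothetical helper: True if text is a duplicate-free list."""
    try:
        checked.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False

samples = ["alpha", "alpha|beta", "beta|alpha|gamma",
           "alpha|beta|alpha", "alpha|delta"]
for sample in samples:
    print(sample, is_valid(sample))
```

The first three samples are the inputs from the question; the last two show a duplicate and an unknown literal being rejected.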

EDIT:

Recent versions of pyparsing have introduced parse-time conditions to make this kind of parse action easier to write:

itemlist = delimitedList(oneOf("alpha beta gamma"), '|')
itemlist.addCondition(lambda tokens: len(set(tokens)) == len(tokens),
                      message="duplicate list entries found")

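A short usage sketch for the condition-based version. It assumes the failure message is supplied through the `message` keyword, since positional arguments to `addCondition` are treated as condition functions. The name `unique_items` is only for illustration.

```python
from pyparsing import ParseException, delimitedList, oneOf

unique_items = delimitedList(oneOf("alpha beta gamma"), "|")
# The condition must return True for the match to be accepted.
unique_items.addCondition(
    lambda tokens: len(set(tokens)) == len(tokens),
    message="duplicate list entries found",
)

# A valid input parses to the list of literals.
print(unique_items.parseString("beta|alpha|gamma", parseAll=True).asList())

# A duplicate makes the condition fail, which raises ParseException.
try:
    unique_items.parseString("alpha|beta|alpha", parseAll=True)
except ParseException as err:
    print("rejected:", err.msg)
```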