Python pattern matching: match 'c[any number of consecutive a's, b's, or c's, or b's, c's, a's, etc.]t'

Posted on 2024-11-19 15:19:59

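Sorry about the title; I couldn't think of a clean way to ask my question.

In Python, I would like to match an expression 'c[some stuff]t', where [some stuff] can be any number of consecutive a's, b's, or c's, in any order.

For example, these work: 'ct', 'cat', 'cbbt', 'caaabbct', 'cbbccaat'

But these don't: 'cbcbbaat', 'caaccbabbt'

Edit: a, b, and c are just an example, but I would really like to be able to extend this to more letters. I'm interested in both regex and non-regex solutions.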

薄荷港 2024-11-26 15:19:59

Not thoroughly tested, but I think this should work:

import re

words = ['ct', 'cat', 'cbbt', 'caaabbct', 'cbbccaat', 'cbcbbaat', 'caaccbabbt']
# Each ([abc])\1* consumes one run; (?!.*\1) forbids that letter from reappearing later.
pat = re.compile(r'^c(?:([abc])\1*(?!.*\1))*t$')
for w in words:
    print(w, "matches" if pat.match(w) else "doesn't match")

# ct matches
# cat matches
# cbbt matches
# caaabbct matches
# cbbccaat matches
# cbcbbaat doesn't match
# caaccbabbt doesn't match

This matches runs of a, b or c (that's the ([abc])\1* part), while the negative lookahead (?!.*\1) makes sure no other instance of that character is present after the run.

(edit: fixed a typo in the explanation)
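
Since the question's edit asks about extending this to more letters, here is a minimal sketch of how the same lookahead pattern might be built from an arbitrary letter set. The helper name `build_run_pattern` and its parameters are hypothetical, not part of the answer above. It assumes `letters` is non-empty and that the suffix is not itself one of the letters.

```python
import re

def build_run_pattern(letters, prefix="c", suffix="t"):
    """Compile a pattern in which each of `letters` forms at most one run.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # re.escape keeps characters such as '-' or ']' literal inside the class.
    char_class = "[" + "".join(re.escape(ch) for ch in letters) + "]"
    # ([...])\1* consumes one run; (?!.*\1) rejects a later repeat of its letter.
    return re.compile(
        rf"{re.escape(prefix)}(?:({char_class})\1*(?!.*\1))*{re.escape(suffix)}"
    )

pat = build_run_pattern("abcd")
for w in ["ct", "cddaabt", "cdadt"]:
    # fullmatch anchors both ends, so ^ and $ are not needed in the pattern.
    print(w, "matches" if pat.fullmatch(w) else "doesn't match")
```

The pattern length grows linearly with the number of letters, which is the main reason to prefer this form over writing out every ordering by hand.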

愿得七秒忆 2024-11-26 15:19:59

Not sure how attached you are to regex, but here is a solution using a different method:

from itertools import groupby

words = ['ct', 'cat', 'cbbt', 'caaabbct', 'cbbccaat', 'cbcbbaat', 'caaccbabbt']
for w in words:
    match = False
    if w.startswith('c') and w.endswith('t'):
        temp = w[1:-1]  # the characters between the leading 'c' and trailing 't'
        s = set(temp)
        # One groupby() group per run: a repeated letter yields more groups than distinct letters.
        match = s <= set('abc') and len(s) == len(list(groupby(temp)))
    print(w, "matches" if match else "doesn't match")

The string matches if the set of the middle characters is a subset of set('abc') and the number of groups returned by groupby() is the same as the number of elements in the set.
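
The same check can be wrapped in a reusable function so that the letter set becomes a parameter. This is only a sketch: `matches_runs` and its arguments are hypothetical names introduced here for illustration.

```python
from itertools import groupby

def matches_runs(word, letters="abc", prefix="c", suffix="t"):
    """Return True if `word` is prefix + runs + suffix, one run per letter at most.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Guard against the prefix and suffix overlapping in a very short word.
    if len(word) < len(prefix) + len(suffix):
        return False
    if not (word.startswith(prefix) and word.endswith(suffix)):
        return False
    middle = word[len(prefix):len(word) - len(suffix)]
    distinct = set(middle)
    # groupby() yields one group per run of equal adjacent characters.
    run_count = sum(1 for _ in groupby(middle))
    return distinct <= set(letters) and len(distinct) == run_count

words = ['ct', 'cat', 'cbbt', 'caaabbct', 'cbbccaat', 'cbcbbaat', 'caaccbabbt']
for w in words:
    print(w, "matches" if matches_runs(w) else "doesn't match")
```

Extending to more letters is then a matter of passing a longer `letters` string, for example `matches_runs(w, letters="abcdef")`.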

简单 2024-11-26 15:19:59

I believe you need to explicitly encode all possible permutations of a's, b's and c's:

c(a*b*c*|b*a*c*|b*c*a*|c*b*a*|c*a*b*|a*c*b*)t

Note that this is an extremely inefficient pattern which may backtrack a lot.
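
If this explicit-permutation approach is wanted for a larger letter set, the alternation could be generated rather than typed out. The sketch below uses a hypothetical helper, `build_permutation_pattern`. Note that the number of alternatives is n! for n letters, so this is only practical for a handful of letters.

```python
import re
from itertools import permutations

def build_permutation_pattern(letters, prefix="c", suffix="t"):
    """Compile prefix(x*y*z*|...)suffix with one alternative per letter ordering.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    alternatives = (
        "".join(re.escape(ch) + "*" for ch in order)
        for order in permutations(letters)
    )
    body = "|".join(alternatives)
    return re.compile(rf"{re.escape(prefix)}(?:{body}){re.escape(suffix)}")

pat = build_permutation_pattern("abc")
print(pat.pattern)
for w in ["caaabbct", "cbcbbaat"]:
    # fullmatch requires the whole word to fit one of the orderings.
    print(w, "matches" if pat.fullmatch(w) else "doesn't match")
```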

叹倦 2024-11-26 15:19:59

I don't know the Python regex engine, but it sounds like you just want to write out the 6 different possible orderings directly.

/c(a*b*c*|a*c*b*|b*a*c*|b*c*a*|c*a*b*|c*b*a*)t/

为你鎻心 2024-11-26 15:19:59

AFAIK there's no "compact" way of doing this...

c(a*(b*c*|c*b*)|b*(a*c*|c*a*)|c*(a*b*|b*a*))t
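
One way to gain some confidence that this factored form, the flat six-way alternation, and the backreference pattern from the thread all accept the same strings is a brute-force comparison over short inputs. The following is a sketch of such a check; the function name `disagreements` and the length limit are arbitrary choices made here, and agreement on short strings is evidence rather than proof.

```python
import re
from itertools import product

# The three patterns as given in the answers in this thread.
flat = re.compile(r"c(a*b*c*|b*a*c*|b*c*a*|c*b*a*|c*a*b*|a*c*b*)t")
nested = re.compile(r"c(a*(b*c*|c*b*)|b*(a*c*|c*a*)|c*(a*b*|b*a*))t")
lookahead = re.compile(r"c(?:([abc])\1*(?!.*\1))*t")

def disagreements(max_len=6):
    """Return every 'c' + middle + 't' word on which the patterns differ."""
    bad = []
    for n in range(max_len + 1):
        for middle in product("abc", repeat=n):
            word = "c" + "".join(middle) + "t"
            results = {bool(p.fullmatch(word)) for p in (flat, nested, lookahead)}
            if len(results) > 1:
                bad.append(word)
    return bad

print(disagreements())
```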
