Regular expression for the regular language over {a,b} whose strings contain no substring of exactly 3 b's (bbb)

Posted 2024-09-13 04:09:30

Pretty much what the question says. I came up with

(ba)?(a + bb + bbbbb + aba)*(ab)?

Is there anything more readable? Or is this incorrect?
I know you shouldn't really be doing this sort of thing with a regex when you can just write !~/bbb/ in your code, but it's a theory exercise.

Thanks.

Edit for clarification: I'm using + rather than | to represent alternation (OR) in the regex. Sorry for the confusion.

Edit 2: {a,b} is the alphabet: the language contains only the characters 'a' and 'b'. It is not a {minimum, maximum} quantifier. Sorry again.

Edit 3: Because this is part of a theory class, we're just dealing with the basics of regular expressions. The only things you're allowed to use are +, ?, () and *. You cannot use {minimum, maximum}.
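
One way to sanity-check a candidate like this is to brute-force every short string over {a,b} and compare the regex against the definition. The sketch below is not from the original post. It assumes that "exactly 3 b's" means a maximal run of three b's (so bbbb is allowed, which is how the answers read it), it rewrites the theory-style + as Python's |, and the helper names in_language and mismatches are made up for illustration.

```python
import re
from itertools import product


def in_language(s: str) -> bool:
    """True if no maximal run of b's in s has length exactly 3 (assumed definition)."""
    return all(len(run) != 3 for run in re.findall(r"b+", s))


def mismatches(pattern: str, max_len: int = 8) -> list[str]:
    """Strings over {a, b} up to max_len on which the pattern and the definition disagree."""
    compiled = re.compile(pattern)
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if (compiled.fullmatch(s) is not None) != in_language(s):
                bad.append(s)
    return bad


# The attempt from the question, with "+" (alternation) written as "|".
ATTEMPT = r"(ba)?(a|bb|bbbbb|aba)*(ab)?"

print(mismatches(ATTEMPT, max_len=3))
```

Under that reading, the list for max_len=3 already contains "b", a string the attempt cannot produce, so the expression as posted appears to miss part of the language.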

Comments (3)

山色无中 2024-09-20 04:09:30

I think I have a working regex. Let b° (a notation I invented just now) be the regex that matches zero or more b's, except that it won't match exactly three of them. This can be replaced by (ε | b | bb | bbbb+), so don't worry that I'm using magic or anything. (Here | is alternation and + means "one or more", so bbbb+ matches four or more b's.) Now, matching strings can be seen as repeated subpatterns of zero or more a's followed by b°, which could be written (a*b°)*, but you need at least one "a" between sequences of b's. So your final regex is a*b°(a+b°)*.

Since b° can match the empty string, the initial a* is superfluous, as the a+ can pick up the initial a's just fine, so the regex can be optimized down to b°(a+b°)* (thanks, wrikken).
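
To see the b° notation in action, here is a small sketch that expands it into Python regex syntax and tries both forms on a few strings. The names B_RUN, FULL and OPTIMIZED are only illustrative, and the empty alternative in B_RUN stands in for ε; none of this is part of the original answer.

```python
import re

# b°: zero or more b's, but never exactly three (the empty alternative plays ε).
B_RUN = r"(?:|b|bb|bbbb+)"

FULL = re.compile(rf"a*{B_RUN}(?:a+{B_RUN})*")       # a*b°(a+b°)*
OPTIMIZED = re.compile(rf"{B_RUN}(?:a+{B_RUN})*")    # b°(a+b°)*

# The first five have no run of exactly three b's; the last three do.
samples = ["", "aa", "abba", "bbbb", "aabbbbba", "bbb", "abbba", "bbabbb"]

for s in samples:
    full_ok = FULL.fullmatch(s) is not None
    opt_ok = OPTIMIZED.fullmatch(s) is not None
    print(repr(s), full_ok, opt_ok)
```

fullmatch is used so the whole string has to be consumed, which matches the theory setting where a regex describes complete strings rather than substrings.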

醉梦枕江山 2024-09-20 04:09:30

Hmm, something like this?

^(a|(?<!b)b{1,2}(?!b)|b{4,})*$

Edit:

"Edit 3: Because this is part of a theory class, we're just dealing with the basics of Regex. The only things you're allowed to use are +, ?, () and *. You cannot use {minimum, maximum}."

Pfff, talk about tying your hands behind your back... Simple answer: you cannot do it with only those (^ and $ are required for it ever to work), and we need the |. So, come up with better conditions. Dropping the lookbehind and lookahead can be done, but it isn't going to be pretty (at least, not without violating DRY):

^(b|bb|bbbb+)?(a+(b|bb|bbbb+)?)*$
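
A quick way to gain some confidence that the lookaround version and the plain version describe the same strings is to compare them on every string over {a,b} up to some length. This is only a sketch; the function name disagreements and the length bound are arbitrary choices, not part of the answer.

```python
import re
from itertools import product

LOOKAROUND = re.compile(r"^(a|(?<!b)b{1,2}(?!b)|b{4,})*$")
BASIC = re.compile(r"^(b|bb|bbbb+)?(a+(b|bb|bbbb+)?)*$")


def disagreements(max_len: int = 10) -> list[str]:
    """Strings over {a, b} up to max_len that only one of the two patterns accepts."""
    out = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            if bool(LOOKAROUND.match(s)) != bool(BASIC.match(s)):
                out.append(s)
    return out


print(disagreements())
```

An empty list for every length tried is evidence, not proof, that the two forms agree.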

豆芽 2024-09-20 04:09:30

You're matching a string without precisely 3 b's in a row. That means you're looking at substrings like "aa", "aba", "abba", and "abbbbb*a", where any of the exterior a's could be the beginning or end of the string, the pieces can overlap, and there can be several of them. This suggests something like:

(a + ab + abb + abbbbb*)*

with appropriate additions to account for the missing a at the beginning of the string. There's a lot of repetition, but that's how regular expressions work in their basic form.
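
The answer leaves the "appropriate additions" open. One possible reading, which is my assumption and not the answerer's, is to allow an optional leading run of b's that has no a in front of it. The sketch below writes the theory-style + as Python's | and checks that candidate against a lookaround test for "a run of exactly three b's"; LEADING, BODY and CANDIDATE are hypothetical names.

```python
import re
from itertools import product

# The answer's expression: each piece is an "a" followed by 0, 1, 2 or 4+ b's.
BODY = r"(?:a|ab|abb|abbbbb*)*"
# Assumed addition: an optional leading run of 1, 2 or 4+ b's.
LEADING = r"(?:b|bb|bbbbb*)?"
CANDIDATE = re.compile(LEADING + BODY)

# Reference check: a run of exactly three b's, not part of a longer run.
EXACTLY_THREE = re.compile(r"(?<!b)bbb(?!b)")


def counterexamples(max_len: int = 9) -> list[str]:
    """Strings over {a, b} up to max_len where CANDIDATE and the reference check disagree."""
    out = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            accepted = CANDIDATE.fullmatch(s) is not None
            if accepted != (EXACTLY_THREE.search(s) is None):
                out.append(s)
    return out


print(counterexamples())
```

Without the leading part, BODY alone cannot match a string such as "ba", because every piece starts with an a.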
