Regular expression to match strings of 0's and 1's without the "011" substring
I'm working on a problem (from Introduction to Automata Theory, Languages, and Computation by Hopcroft, Motwani and Ullman) to write a regular expression that defines the language consisting of all strings of 0s and 1s that do not contain the substring 011.
Is the answer (0+1)* - 011 correct? If not, what should the correct answer be?
Edit: Updated to include start states and fixes, as per the comments below.
If you are looking for all strings that do not have 011 as a substring, rather than simply excluding the string 011, a classic regex for that would be 1*(0+01)*.
Basically you can have as many ones at the beginning as you want, but as soon as you hit a zero, what follows is either zeros or zero-ones (since otherwise you'd get a zero-one-one).
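To check the description above mechanically, here is a minimal Python sketch. It assumes the classic expression is 1*(0+01)*, written `1*(0|01)*` in Python's syntax because Python uses `|` for union. The helper name `agrees_up_to` and the length bound are illustrative choices, not part of the original answer.

```python
import re
from itertools import product

# Assumed Python spelling of 1*(0+01)*: '+' (union) becomes '|'.
CLASSIC = re.compile(r"1*(0|01)*")

def agrees_up_to(max_len):
    """Return True if the regex and a direct substring test agree
    on every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            accepted = CLASSIC.fullmatch(s) is not None
            if accepted != ("011" not in s):
                return False
    return True

print(agrees_up_to(12))
```

An exhaustive check over short strings is not a proof, but a disagreement would show up quickly if the expression were wrong.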
A modern, not-really-regular regex can express the same constraint.
If, however, you want any string that is not exactly 011, you can simply enumerate the short strings and wildcard the rest, which a modern regex can also do.
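For the second case (any string other than exactly 011), the following Python sketch spells out one way to "enumerate the short strings and wildcard the rest". Both patterns are assumptions reconstructed from the description, not the answer's own expressions: `ENUMERATED` lists every string of length at most 3 except 011 and then allows any longer string, and `LOOKAHEAD` is a hypothetical lookahead-based equivalent.

```python
import re
from itertools import product

def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

# Classic style: list every string of length <= 3 except 011,
# then allow anything of length 4 or more.
short = [s for s in binary_strings(3) if s != "011"]
ENUMERATED = re.compile("(?:" + "|".join(short) + "|[01]{4,})")

# Lookahead style (assumed pattern): reject only when the whole input is 011.
LOOKAHEAD = re.compile(r"(?!011$)[01]*")

def same_language(max_len):
    """True if both patterns accept exactly the strings other than 011."""
    return all(
        (ENUMERATED.fullmatch(s) is not None) == (s != "011")
        and (LOOKAHEAD.fullmatch(s) is not None) == (s != "011")
        for s in binary_strings(max_len)
    )

print(same_language(8))
```

Note that this language is much larger than the one in the question: 0110, for example, is accepted here even though it contains 011 as a substring.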
The simplest to code (and understand) combines a negative lookahead, (?!.*011), for the prohibited substring with [10]* for a string composed of 1's and 0's (change it to [10]+ if a blank string should not pass).
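As a rough illustration of the lookahead approach, the Python sketch below joins the two quoted pieces into one pattern. The way they are joined is an assumption, and `fullmatch` is used in place of explicit `^` and `$` anchors; the helper `passes` is a hypothetical name.

```python
import re

# Assumed assembly of the two pieces: the lookahead first, then the alphabet.
NO_011 = re.compile(r"(?!.*011)[10]*")
NO_011_NONEMPTY = re.compile(r"(?!.*011)[10]+")  # blank is not a pass

def passes(pattern, s):
    """True if the whole string s is accepted by the compiled pattern."""
    return pattern.fullmatch(s) is not None

for s in ["", "1110", "010101", "0110", "10011", "012"]:
    print(repr(s), passes(NO_011, s), passes(NO_011_NONEMPTY, s))
```

The lookahead runs once at the start of the string and fails if 011 occurs anywhere later, so the character class only has to restrict the alphabet.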
Here is the regular expression:
0^* | 1 + 0^* | 0^* 01 + 0^*
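The notation in this expression is ambiguous as printed, so a small Python sketch can test it under two assumed readings: one where + is a postfix "one or more", and one where + is union like |. The helper `first_mismatch` is a hypothetical name; it returns the shortest binary string on which a pattern and the plain substring test disagree.

```python
import re
from itertools import product

def first_mismatch(pattern, max_len=8):
    """Return the shortest binary string on which `pattern` and the
    substring test disagree, or None if they agree up to max_len."""
    regex = re.compile(pattern)
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            accepted = regex.fullmatch(s) is not None
            if accepted != ("011" not in s):
                return s
    return None

# Reading 1 (assumption): '+' is a postfix "one or more".
print(first_mismatch(r"0*|1+0*|0*01+0*"))
# Reading 2 (assumption): '+' is union, like '|'.
print(first_mismatch(r"0*|1|0*|0*01|0*"))
```

Under both readings the helper finds a short counterexample: the first reading accepts 011 itself, and the second rejects 10. The expression as printed here therefore does not seem to define the required language, though it may have been altered in transcription.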