如何从正则表达式中找到语言?
我如何在字母表 {a, b} 上找到以下正则表达式的语言?
aUb*
(ab*Uc)
ab*Ubc*
a*bc*Uac
编辑:在我疯狂地被否决之前,如果有人能向我展示解决这些问题的步骤,而不仅仅是解决方案,我将不胜感激。也许甚至可以引导我完成其中一个,这样我就可以自己完成剩下的事情。
谢谢!
How would I find the language for the following regular expressions over the alphabet {a, b}?
aUb*
(ab*Uc)
ab*Ubc*
a*bc*Uac
EDIT: Before i get downvoted like crazy, i'd appreciate it if someone could show me the steps towards solving these problems, not just the solution. Maybe even walking me through one so I can do the rest on my own.
Thanks!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
编辑:简短的回答,
*
在几乎所有正则表达式/语法语法中表示“零次或多次重复”,包括 perl5 和 RFC 5234。*
通常比串联和交替绑定更紧密。您说您想要一种基于字母表 (a, b) 的语言,但在表达式中包含
c
和U
。我假设您想要一个字母表(a、b、c)上的语言语法,其形式类似于 BNF 给定正则表达式,其中U
是低优先级联合运算符,类似于Perl re 中的|
在那种情况下,
等价于BNF:
*
运算符表示零或多个,因此M
中的第一个子句是基本情况,第二个子句是的递归使用>M
包括终端b
。L 只是单个终结符
a
或非终结符M
。M 的推导方式与上面相同,其余的就不言自明了。
N的推导方式与上述M相同。
大多数正则表达式语言中的
*
比连接更紧密地结合,因此这与归结为相同。
一般要将正则表达式转换为 BNF,只需在正则表达式中使用邻接来表示邻接BNF 和正则表达式中的
U
或|
表示 BNF 中的|
。如果定义非终结符
; ::= x
那么你可以这样处理x*
:使用相同的非终结符
; ::= x
那么你可以这样处理x+
:这会得到重复运算符
*
和+
,剩下?
。x?
很简单Edit: short answer,
*
means "zero or more repetitions" in almost all regex/grammar syntaxes include perl5 and RFC 5234.*
typically binds more tightly than concatenation and alternation.You say you want a language over the alphabet (a, b), but include
c
andU
in your expressions. I'm going to assume that you want a language grammar over the alphabet (a, b, c) in a form like BNF given a regular expression whereU
is a low-precedence union operator, similar to|
in perl re.In that case,
is equivalent to the BNF:
The
*
operator means zero or more, so the first clause inM
is the base case, and the second clause is a recursive use ofM
that includes a terminalb
.L is just either a single terminal
a
or the nonterminalM
.M is derived the same way as above, and the rest is self explanatory.
N is derived in the same way as M above.
The
*
in most regular expression languages binds more tightly than concatenation, so this is the same aswhich boils down to
In general to convert a regex to BNF, you simply use adjacency in a regex to mean adjaceny in BNF, and
U
or|
in a regex means|
in BNF.If you define a nonterminal
<X> ::= x
then you can handlex*
thus:With the same nonterminal
<X> ::= x
then you can handlex+
thus:That gets you the repetition operators
*
and+
which leaves?
.x?
is simply如果您知道星号、并集和串联的含义,那么这些应该很容易。第一个是联盟b星。根据运算顺序,这意味着一个联合(b星)。联合意味着左侧语言中的任何内容或右侧语言中的任何内容。 a表示由长度为1的字符串a组成的语言; b 星表示由任何字符串组成的语言,该字符串由零个或多个 b 符号组成,仅此而已。所以这个语言是{empty, a, b, bb, bbb, bbbb, ...}。在第二个中,ab* 表示由 a 后跟零个或多个 b 符号组成的所有字符串。首先加星号,然后串联,然后并集,遵循给定的显式括号。
If you know what star, union and concatenation mean, these should be easy. The first is a union b star. According to order of operations, this means a union (b star). Union means anything in the language on the left or anything in the language on the right. a means the language consisting of the length-one string a; b star means the language consisting of any string which consists of zero or more b symbols and nothing else. So this language is {empty, a, b, bb, bbb, bbbb, ...}. In the second one, ab* means all strings consisting of an a followed by zero or more b symbols. Do star first, then concatenation, then union, obeying the given explicit parentheses.
尽管迈克给出了生成由正则表达式表示的语言的语法,但您的作业要求语言本身。因为您正在处理正则表达式,所以您的答案必须是正则集。
回想一下字母表上正则集的定义:
现在回想一下字母表上正则表达式的定义:
因此,翻译应该很简单。事实上,它包括在每个字母周围插入大括号!例如:
如果您想用集合构建器表示法来表达每个正则表达式的语言,您可以恢复到正则集合上的运算的定义。回想一下:
通过替换常规集合运算的定义,可以将常规集合转换为集合构建器表示法。为了避免引入嵌套的集合构建器表示法,我将相等与串联的定义结合使用来表达常规集合的串联。
您现在应该能够毫无困难地找到其余表达式所表示的语言。
Although Mike gave grammars generating the languages denoted by the regular expressions, your assignment requests the languages themselves. Because you're dealing with regular expressions, your answers must be regular sets.
Recall the definition of regular sets over an alphabet:
Now recall the definition of regular expressions over an alphabet:
The translation, therefore, should be straightforward. In fact, it consists of inserting curly brackets around each letter! For example:
If you want to express the language of each regular expression in set-builder notation, you can revert to the definitions of the operations over regular sets. Recall:
The regular sets can be translated into set builder notation by substitution of the definitions of the regular set operations. To avoid introducing nested set-builder notation, I've used equality in conjunction with the definition of concatenation to express the concatenation of regular sets.
You should now be able to find the languages denoted by the remaining expressions without difficulty.