如何从正则表达式中找到语言?

发布于 2024-12-07 20:38:39 字数 195 浏览 8 评论 0原文

我如何在字母表 {a, b} 上找到以下正则表达式的语言?

aUb*
(ab*Uc)
ab*Ubc*
a*bc*Uac

编辑:在我疯狂地被否决之前,如果有人能向我展示解决这些问题的步骤,而不仅仅是解决方案,我将不胜感激。也许甚至可以引导我完成其中一个,这样我就可以自己完成剩下的事情。

谢谢!

How would I find the language for the following regular expressions over the alphabet {a, b}?

aUb*
(ab*Uc)
ab*Ubc*
a*bc*Uac

EDIT: Before i get downvoted like crazy, i'd appreciate it if someone could show me the steps towards solving these problems, not just the solution. Maybe even walking me through one so I can do the rest on my own.

Thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

暗喜 2024-12-14 20:38:39

编辑:简短的回答,* 在几乎所有正则表达式/语法语法中表示“零次或多次重复”,包括 perl5 和 RFC 5234。* 通常比串联和交替绑定更紧密。

您说您想要一种基于字母表 (a, b) 的语言,但在表达式中包含 cU 。我假设您想要一个字母表(a、b、c)上的语言语法,其形式类似于 BNF 给定正则表达式,其中 U 是低优先级联合运算符,类似于Perl re 中的 |

在那种情况下,

<前><代码>a|b*

等价于BNF:

L := a
   | <M>
M := epsilon
   | b <M>

*运算符表示零或多个,因此M中的第一个子句是基本情况,第二个子句是的递归使用>M 包括终端 b

L 只是单个终结符a 或非终结符M

<前><代码>(ab*|c)

L ::= a <M>
    | c
M ::= epsilon
    | b <M>

M 的推导方式与上面相同,其余的就不言自明了。

<前><代码>ab*|bc*

L ::= a <M>
    | b <N>
M ::= epsilon
    | b <M>
N ::= epsilon
    | c <N>

N的推导方式与上述M相同。

<前><代码>a*bc*|ac

大多数正则表达式语言中的 * 比连接更紧密地结合,因此这与归结为相同

(a*b(c*))|(ac)

L ::= <M> b <N>
    | a c
M ::= epsilon
    | a <M>
N ::= epsilon
    | c <N>

一般要将正则表达式转换为 BNF,只需在正则表达式中使用邻接来表示邻接BNF 和正则表达式中的 U| 表示 BNF 中的 |

如果定义非终结符 ; ::= x 那么你可以这样处理 x*

L ::= epsilon
    | <X> <L>

使用相同的非终结符 ; ::= x 那么你可以这样处理 x+

L ::= <X>
    | <L> <X>

这会得到重复运算符 *+ ,剩下 x? 很简单

L ::= epsilon
    | <X>

Edit: short answer, * means "zero or more repetitions" in almost all regex/grammar syntaxes include perl5 and RFC 5234. * typically binds more tightly than concatenation and alternation.

You say you want a language over the alphabet (a, b), but include c and U in your expressions. I'm going to assume that you want a language grammar over the alphabet (a, b, c) in a form like BNF given a regular expression where U is a low-precedence union operator, similar to | in perl re.

In that case,

a|b*

is equivalent to the BNF:

L := a
   | <M>
M := epsilon
   | b <M>

The * operator means zero or more, so the first clause in M is the base case, and the second clause is a recursive use of M that includes a terminal b.

L is just either a single terminal a or the nonterminal M.

(ab*|c)
L ::= a <M>
    | c
M ::= epsilon
    | b <M>

M is derived the same way as above, and the rest is self explanatory.

ab*|bc*
L ::= a <M>
    | b <N>
M ::= epsilon
    | b <M>
N ::= epsilon
    | c <N>

N is derived in the same way as M above.

a*bc*|ac

The * in most regular expression languages binds more tightly than concatenation, so this is the same as

(a*b(c*))|(ac)

which boils down to

L ::= <M> b <N>
    | a c
M ::= epsilon
    | a <M>
N ::= epsilon
    | c <N>

In general to convert a regex to BNF, you simply use adjacency in a regex to mean adjaceny in BNF, and U or | in a regex means | in BNF.

If you define a nonterminal <X> ::= x then you can handle x* thus:

L ::= epsilon
    | <X> <L>

With the same nonterminal <X> ::= x then you can handle x+ thus:

L ::= <X>
    | <L> <X>

That gets you the repetition operators * and + which leaves ?. x? is simply

L ::= epsilon
    | <X>
演出会有结束 2024-12-14 20:38:39

如果您知道星号、并集和串联的含义,那么这些应该很容易。第一个是联盟b星。根据运算顺序,这意味着一个联合(b星)。联合意味着左侧语言中的任何内容或右侧语言中的任何内容。 a表示由长度为1的字符串a组成的语言; b 星表示由任何字符串组成的语言,该字符串由零个或多个 b 符号组成,仅此而已。所以这个语言是{empty, a, b, bb, bbb, bbbb, ...}。在第二个中,ab* 表示由 a 后跟零个或多个 b 符号组成的所有字符串。首先加星号,然后串联,然后并集,遵循给定的显式括号。

If you know what star, union and concatenation mean, these should be easy. The first is a union b star. According to order of operations, this means a union (b star). Union means anything in the language on the left or anything in the language on the right. a means the language consisting of the length-one string a; b star means the language consisting of any string which consists of zero or more b symbols and nothing else. So this language is {empty, a, b, bb, bbb, bbbb, ...}. In the second one, ab* means all strings consisting of an a followed by zero or more b symbols. Do star first, then concatenation, then union, obeying the given explicit parentheses.

罗罗贝儿 2024-12-14 20:38:39

尽管迈克给出了生成由正则表达式表示的语言的语法,但您的作业要求语言本身。因为您正在处理正则表达式,所以您的答案必须是正则集。

回想一下字母表上正则集的定义:

Let Σ be an alphabet. The class of regular sets over Σ is the smallest class
containing ∅, {λ}, and {a}, for all a ∈ Σ, and closed under union, concatenation,
and Kleene star.

现在回想一下字母表上正则表达式的定义:

Let Σ be an alphabet. The class of regular expressions over Σ is the smallest
class containing ∅, λ, and a, for all a ∈ Σ, and closed under union, concat-
enation, and Kleene star.

因此,翻译应该很简单。事实上,它包括在每个字母周围插入大括号!例如:

a ∪ b*  denotes {a} ∪ {b}*
ab* ∪ c denotes {a}{b}* ∪ {c}
...

如果您想用集合构建器表示法来表达每个正则表达式的语言,您可以恢复到正则集合上的运算的定义。回想一下:

Let A and B be regular sets. Then
   1  A ∪ B = {x : x ∈ A ∨ x ∈ B}
   2. AB    = {xy : x ∈ A ∧ y ∈ B}
   3. A*    = ∪[i = 0 -> ∞]A^i

通过替换常规集合运算的定义,可以将常规集合转换为集合构建器表示法。为了避免引入嵌套的集合构建器表示法,我将相等与串联的定义结合使用来表达常规集合的串联。

{a} ∪ {b}*    = {w : w ∈ {a} ∨ w ∈ ∪[i = 0 -> ∞]{b}^i}
{a}{b}* ∪ {c} = {w : (w = xy ∧ (x ∈ {a} ∧ y ∈ ∪[i = 0 -> ∞]{b}^i)) ∨ w ∈ {c}}
...

您现在应该能够毫无困难地找到其余表达式所表示的语言。

Although Mike gave grammars generating the languages denoted by the regular expressions, your assignment requests the languages themselves. Because you're dealing with regular expressions, your answers must be regular sets.

Recall the definition of regular sets over an alphabet:

Let Σ be an alphabet. The class of regular sets over Σ is the smallest class
containing ∅, {λ}, and {a}, for all a ∈ Σ, and closed under union, concatenation,
and Kleene star.

Now recall the definition of regular expressions over an alphabet:

Let Σ be an alphabet. The class of regular expressions over Σ is the smallest
class containing ∅, λ, and a, for all a ∈ Σ, and closed under union, concat-
enation, and Kleene star.

The translation, therefore, should be straightforward. In fact, it consists of inserting curly brackets around each letter! For example:

a ∪ b*  denotes {a} ∪ {b}*
ab* ∪ c denotes {a}{b}* ∪ {c}
...

If you want to express the language of each regular expression in set-builder notation, you can revert to the definitions of the operations over regular sets. Recall:

Let A and B be regular sets. Then
   1  A ∪ B = {x : x ∈ A ∨ x ∈ B}
   2. AB    = {xy : x ∈ A ∧ y ∈ B}
   3. A*    = ∪[i = 0 -> ∞]A^i

The regular sets can be translated into set builder notation by substitution of the definitions of the regular set operations. To avoid introducing nested set-builder notation, I've used equality in conjunction with the definition of concatenation to express the concatenation of regular sets.

{a} ∪ {b}*    = {w : w ∈ {a} ∨ w ∈ ∪[i = 0 -> ∞]{b}^i}
{a}{b}* ∪ {c} = {w : (w = xy ∧ (x ∈ {a} ∧ y ∈ ∪[i = 0 -> ∞]{b}^i)) ∨ w ∈ {c}}
...

You should now be able to find the languages denoted by the remaining expressions without difficulty.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文