当前位置：文江博客话题详情

正则表达式等价

发布于 2024-12-10 10:01:42 字数 165 浏览 6 评论 0原文

以下正则表达式等价是否正确？为什么或为什么不呢？

(ab)* u (aba)* = (ab u aba)*

*=克莱恩星
u=并集（集合论）

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

粉红×色少女 2024-12-17 10:01:42

不，它们并不等同。右侧的语言包含“abaab”，但左侧的语言不包含“abaab”。这些之间有什么关系吗？是的;但我不会只给你答案。 提示：RHS 中是否有 LHS 中没有的字符串？

编辑：

只是为感兴趣的读者解释一下。语言是字符串的集合。因此，集合之间的关系也是语言之间的关系。最常见的集合关系是相等、子集和超集。给定全域集合 U1 和 U2 上的两个集合 A 和 B，集合 A 是全域集合 U1 u U2 下 B 的子集（u 代表并集）当且仅当 A 的每个元素也是 B 的元素。同样，给定两个集合 A和 B 在宇宙集合 U1 和 U2 上，集合 A 是宇宙集合 U1 u U2 下 B 的超集（u 代表并集）当且仅当 B 的每个元素也是 A 的元素（等效地，当且仅当 B 是 A 的子集）。集合 A 和 B 相等，当且仅当 A 是 B 的子集且 B 是 A 的子集（等效地，当且仅当 A 是 B 的超集且 B 是 A 的超集）。请注意，两个集合 A 和 B 不必处于这三种关系中的任何一种；当 A 包含 B 中不存在的元素，同时 B 包含 A 中不存在的元素时，就会发生这种情况。

要找出集合 A 和 B 之间存在四种可能的关系中的哪一种 - 相等、子集、超集、无 - 你通常检查 A 是否是 B 的子集以及 B 是否是 A 的子集。调用第一个检查 B1 和第二个检查 B2，其中 B1 和 B2 是布尔变量，并且当且仅当检查通过时为 true（即 A 是 A 的子集） B1的情况下为B，并且在 B2) 的情况下，B 是 A 的子集。那么(B1 && B2)对应于等式，(B1 && !B2)对应于子集，(!B1 && B2)对应于超集，(!B1 && !B2)对应于超集没有任何关系。

在上面的示例中，我通过证明右侧包含左侧不包含的元素来证明两种语言不相等。请注意，这也排除了“A 是 B 的超集”关系。剩下两个：B 是 A 的超集，或者它们之间没有关系。要确定这一点，您必须确定 A 是否是 B 的子集； LHS 语言中的元素是否全部是 RHS 语言中的。如果是，则 LHS 是一个子集；否则，就没有关系。

为了证明一个集合中有一个元素而不是另一个集合中的元素，最简单、最有说服力的方法是通过反例证明：在一个集合中命名一个字符串，而在另一个集合中不命名。这是我采用的方法。您还可以提出一个论点，即该语言必须包含这样的元素，而无需显式命名它；这种证明的正确性要困难得多。

为了证明集合 A 的每个元素也在集合 B 中，您将需要更通用的证明技术。归纳证明和反证法是常见的例子。要进行归纳证明，您可以显式或隐式地为语言中的每个字符串分配一个自然数，并通过一个简单的参数证明您的主张（在本例中，该元素也是另一个集合的元素）是正确的。然后，您假设集合中的前 n 个元素为真（根据您给出的编号），并表明这意味着后面出现的所有元素也必须满足您的要求。为了通过反证法进行证明，你假设与你想要证明的相反，并得出矛盾。如果你成功了并且你唯一的假设是你的主张是错误的，那么你的主张一定一直都是正确的。

No, they aren't equivalent. The language on the RHS contains "abaab", but the language on the LHS doesn't. Is there any relationship among these? Yes; but I won't just give you the answer. Hint: are there any strings in the RHS that aren't in the LHS?

EDIT:

Just to expound a little for the interested reader. Languages are sets of strings. Therefore, relationships among sets are also relationships among languages. The most common set relationships are equality, subset, and superset. Given two sets A and B over universe sets U1 and U2, set A is a subset of B under universe set U1 u U2 (u stands for union) iff every element of A is also an element of B. Similarly, given two sets A and B over universe sets U1 and U2, set A is a superset of B under universe set U1 u U2 (u stands for union) iff every element of B is also an element of A (equivalently, iff B is a subset of A). Sets A and B are equal iff A is a subset of B and B is a subset of A (equivalently, iff A is a superset of B and B is a superset of A). Note that two sets A and B need not be in any of these three relationships; this happens when A contains an element not in B and, at the same time, B contains an element not in A.

To find which of the four possible relationships exist between sets A and B - equality, subset, superset, none - you usually check whether A is a subset of B and whether B is a subset of A. Call the first check B1 and the second check B2, where B1 and B2 are Boolean variables and are true iff the checks pass (i.e., A is a subset of B in the case of B1, and B is a subset of A in the case of B2). Then (B1 && B2) corresponds to equality, (B1 && !B2) corresponds to subset, (!B1 && B2) corresponds to superset and (!B1 && !B2) corresponds to no relationship.

In the example above, I demonstrate that the two languages are not equal by demonstrating that the RHS contains an element not in the LHS. Note that this also rules out the "A is a superset of B" relationship. Two remain: B is a superset of A, or there is no relationship between them. To decide this, you must determine whether A is a subset of B; whether the elements in the LHS language are all in the RHS language. If so, the LHS is a subset; otherwise, there is no relationship.

To show that there is an element in one set and not another, the easiest and most convincing approach is a proof by counterexample: name a string in one but not the other. This is the approach I adopted. You can also make an argument that the language must contain such an element without explicitly naming it; this kind of proof can be significantly harder to get right.

To show that every element of set A is also in set B, you will need a more generic proof technique. Proof by induction and proof by contradiction are common examples. To do an inductive proof, you assign - explicitly or implicitly - a natural number to each string in the language, demonstrate that your claim (in this case, that the element is also an element of the other set) is true with a simple argument. Then, you assume it is true for the first n elements in your set (according to the numbering you give) and show that this implies all the elements that come afterwards must also satisfy your claim. To do a proof by contradiction, you assume the opposite of what you want to prove, and derive a contradiction. If you succeed and your only assumption was that your claim is false, then your claim must have been true all along.

回复收藏 0 原文