Nullability (Regular Expressions)

Posted on 2024-10-10 04:01:59

In Brzozowski's "Derivatives of Regular Expressions" and elsewhere, the function δ(R), which returns λ if R is nullable and ∅ otherwise, is defined with clauses such as the following:

δ(R1 + R2) = δ(R1) + δ(R2)
δ(R1 · R2) = δ(R1) ∧ δ(R2)

Clearly, if both R1 and R2 are nullable then (R1 · R2) is nullable, and if either R1 or R2 is nullable then (R1 + R2) is nullable. However, it is unclear to me what the above clauses are supposed to mean. My first thought, mapping (+), (·), or the Boolean operations to regular sets, is nonsensical, since in the base cases,

δ(a) = ∅ (for all a ∈ Σ)
δ(λ) = λ
δ(∅) = ∅

and λ is not a set (nor is a set the return type of δ, which is a regular expression). Furthermore, this mapping isn't indicated, and there is a separate notation for it. I understand nullability, but I'm lost on the definition of the sum, product, and Boolean operations in the definition of δ: how is λ or ∅ obtained from δ(R1) ∧ δ(R2), for instance, in the definition of δ(R1 · R2)?

Comments (3)

面如桃花 2024-10-17 04:01:59

I think you were right to map + and ∧ to Boolean "or" and "and", respectively. It looks like the two lines you cited deal with alternation (sum) and concatenation (product):

δ(R1 + R2) = δ(R1) + δ(R2)

The alternation of R1 and R2 is nullable if R1 is nullable, R2 is nullable, or both R1 and R2 are nullable.

δ(R1 · R2) = δ(R1) ∧ δ(R2)

The concatenation of R1 and R2 is nullable only if both R1 and R2 are nullable.

See here for a Haskell implementation of these rules.
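As a rough illustration of the Boolean reading (this is not the linked Haskell implementation, just a minimal Python sketch), nullability can be computed as a plain true/false value. The class names Empty, Epsilon, Sym, Alt, and Cat and the function name nullable are made up for this example; they are not taken from the paper.

```python
from dataclasses import dataclass


class Regex:
    """Base class for a tiny, hypothetical regular-expression AST."""


@dataclass(frozen=True)
class Empty(Regex):      # ∅: matches nothing
    pass


@dataclass(frozen=True)
class Epsilon(Regex):    # λ: matches only the empty string
    pass


@dataclass(frozen=True)
class Sym(Regex):        # a single symbol a ∈ Σ
    char: str


@dataclass(frozen=True)
class Alt(Regex):        # R1 + R2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Cat(Regex):        # R1 · R2
    left: Regex
    right: Regex


def nullable(r: Regex) -> bool:
    """Return True exactly when r accepts the empty string."""
    if isinstance(r, Epsilon):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Alt):
        # δ(R1 + R2) = δ(R1) + δ(R2), with + read as Boolean "or"
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Cat):
        # δ(R1 · R2) = δ(R1) ∧ δ(R2), with ∧ read as Boolean "and"
        return nullable(r.left) and nullable(r.right)
    raise TypeError(f"unknown regex node: {r!r}")
```

For example, nullable(Alt(Sym("a"), Epsilon())) is True, while nullable(Cat(Sym("a"), Epsilon())) is False. Under this reading, True plays the role of λ and False the role of ∅.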

黑色毁心梦 2024-10-17 04:01:59

I think you're getting caught out by the notational liberties taken by the author. The return type of δ(R) is most certainly a set, or rather a language. If you look at the definition:

δ(R) = λ if λ ∈ R
δ(R) = ∅ if λ ∉ R

you can see that there is an inconsistency in the return type: formally, λ is an element, but ∅ is the empty language. What it should say is:

δ(R) = {λ} if λ ∈ R
δ(R) = ∅ if λ ∉ R

The fact that the author uses λ for both the empty string as well as the language containing only the empty string is further evidenced by his definition of the Kleene star operator:

[image: definition of the Kleene star operator]

Clearly, if we want to be pedantic, the last part should be: [image]

Given that the return type of δ(R) is a set, or rather a language, the equations you give make perfect sense and express exactly what you described.
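To make the set-valued reading concrete, here is a minimal Python sketch in which δ returns a language: either {λ} or ∅. It assumes + means union and ∧ means intersection. The nested-tuple encoding of regular expressions and the names LAMBDA, EMPTY, and delta are assumptions made for this example only.

```python
LAMBDA = frozenset({""})   # the language {λ}: only the empty string
EMPTY = frozenset()        # the language ∅: no strings at all


def delta(r: tuple) -> frozenset:
    """Return {λ} if r is nullable, else ∅.

    r is a nested tuple in a hypothetical encoding:
    ("empty",), ("eps",), ("sym", a), ("alt", r1, r2), ("cat", r1, r2).
    """
    tag = r[0]
    if tag == "eps":
        return LAMBDA
    if tag in ("empty", "sym"):
        return EMPTY
    if tag == "alt":
        # δ(R1 + R2) = δ(R1) + δ(R2), with + read as set union
        return delta(r[1]) | delta(r[2])
    if tag == "cat":
        # δ(R1 · R2) = δ(R1) ∧ δ(R2), with ∧ read as set intersection
        return delta(r[1]) & delta(r[2])
    raise ValueError(f"unknown regex node: {r!r}")
```

With this encoding, delta(("alt", ("sym", "a"), ("eps",))) is LAMBDA and delta(("cat", ("eps",), ("sym", "a"))) is EMPTY. Since the only values involved are {λ} and ∅, intersection here gives the same results as concatenating the two languages would.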

古镇旧梦 2024-10-17 04:01:59

I can't consult Brzozowski's article to check what is meant there, but I can suggest two ways to interpret this notation. (As far as I can see, the notation is the only issue: the intended meaning of the definition is well understood.)

1) On the left of the definition, we have just "syntactic" patterns for regular expressions. On the right, we produce sets. Remember that a regular expression is a way to denote a language (a set), so this way of writing the definition becomes understandable: on the right, we simply use some (simple) regular expressions as shorthand for sets. That is, ∅ means the empty language (the empty set), and λ (if interpreted as a regular expression) means the language containing just the empty word (the set with this single element).

The operations are simply operations on sets: probably union and intersection.

If the notation is interpreted this way, there is no contradiction with the notation used to define the base case: again, "a" is a regular expression that stands for the language containing the word "a".

2) We build regular expressions on the right in the first place, but the author has extended the operations that build regular expressions with the wedge (∧), which has the semantics of language intersection.
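The second interpretation can be sketched in Python as two steps: first build the right-hand side as an unevaluated expression over λ, ∅, + and ∧, then reduce it to a single λ or ∅. This is only an illustration under that assumption. The tuple encoding and the names delta_expr and simplify are hypothetical, not notation from the paper.

```python
EPS = ("eps",)       # the regular expression λ
EMPTY = ("empty",)   # the regular expression ∅


def delta_expr(r: tuple) -> tuple:
    """Build the right-hand side of the δ clauses without evaluating it.

    Input encoding (hypothetical): ("empty",), ("eps",), ("sym", a),
    ("alt", r1, r2), ("cat", r1, r2). The output may also contain
    ("and", e1, e2), standing for the wedge e1 ∧ e2.
    """
    tag = r[0]
    if tag == "eps":
        return EPS
    if tag in ("empty", "sym"):
        return EMPTY
    if tag == "alt":
        return ("alt", delta_expr(r[1]), delta_expr(r[2]))
    if tag == "cat":
        return ("and", delta_expr(r[1]), delta_expr(r[2]))
    raise ValueError(f"unknown regex node: {r!r}")


def simplify(e: tuple) -> tuple:
    """Reduce an expression whose leaves are only λ and ∅ to λ or ∅."""
    tag = e[0]
    if tag in ("eps", "empty"):
        return e
    left, right = simplify(e[1]), simplify(e[2])
    if tag == "alt":
        # the union is {λ} as soon as one side denotes {λ}
        return EPS if EPS in (left, right) else EMPTY
    if tag == "and":
        # the intersection is {λ} only when both sides denote {λ}
        return EPS if left == EPS and right == EPS else EMPTY
    raise ValueError(f"unexpected node in δ expression: {e!r}")
```

For δ(λ · a), delta_expr(("cat", ("eps",), ("sym", "a"))) yields ("and", ("eps",), ("empty",)), which is the expression λ ∧ ∅, and simplify reduces it to ("empty",), which is ∅.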
