Generalizing the pumping lemma for UNIX-style regular expressions
Most UNIX regular expressions have, besides the usual `*`, `+`, and `?` operators, a backslash operator where `\1`, `\2`, ... match whatever the corresponding parenthesized group matched. So, for example, `L = (a*)b\1` matches the (non-regular) language *a^n b a^n*.
On one hand, this seems pretty powerful, since you can write `(a*)b\1b\1` to match the language *a^n b a^n b a^n*, which can't even be recognized by a pushdown automaton. On the other hand, I'm pretty sure *a^n b^n* cannot be expressed this way.
I have two questions:
- Is there any literature on this family of languages (UNIX-y regular)? In particular, is there a version of the pumping lemma for these?
- Can someone prove, or disprove, that *a^n b^n* cannot be expressed this way?
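
To make the two backreference patterns from the question concrete, here is a minimal sketch. It assumes Ruby as a stand-in for any backreference-capable engine; the constant names and the `in_language?` helper are hypothetical and exist only for this illustration.

```ruby
# (a*)b\1 from the question, anchored so the whole string must match.
ANBAN = /\A(a*)b\1\z/
# (a*)b\1b\1 from the question: three equal runs of a's separated by b's.
ANBANBAN = /\A(a*)b\1b\1\z/

# Hypothetical helper: true when the pattern matches the entire string.
def in_language?(pattern, s)
  !pattern.match(s).nil?
end

puts in_language?(ANBAN, "aabaa")       # equal runs of a's around the b
puts in_language?(ANBAN, "aaba")        # unequal runs
puts in_language?(ANBANBAN, "aabaabaa") # three equal runs
puts in_language?(ANBANBAN, "aabaaba")  # last run is too short
```

The backreference `\1` must repeat exactly the text captured by the first group, which is what forces the runs of `a` to have the same length.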
Comments (3)
You're probably looking for
and of course follow their citations forward and backward to find more literature on this subject.
a^n b^n is a context-free language (CFL). The grammar is
You can use the pumping lemma for regular languages (RL) to prove that A is not an RL.
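
The pumping-lemma argument mentioned here can be spelled out roughly as follows. This is a sketch, with $L$ standing for the language of all strings a^n b^n and $p$ for the pumping length; both symbols are introduced only for this illustration.

```latex
\begin{align*}
&\text{Assume } L = \{\, a^n b^n : n \ge 0 \,\} \text{ is regular, with pumping length } p. \\
&\text{Take } w = a^p b^p \in L \text{ and write } w = xyz \text{ with } |xy| \le p,\ |y| \ge 1. \\
&\text{Since } |xy| \le p, \text{ we have } y = a^k \text{ for some } k \ge 1. \\
&\text{Then } x y^2 z = a^{p+k} b^p \notin L, \text{ contradicting the pumping lemma.} \\
&\text{Hence } L \text{ is not regular.}
\end{align*}
```

Note that this only shows a^n b^n is not regular in the classical sense; it says nothing by itself about expressions with backreferences.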
Ruby 1.9.1 supports the following regex:
"Fun with Ruby 1.9 Regular Expressions" has an example where he actually arranges all the parts of a regex so that it looks like a context-free grammar as follows:
I think this means that at least Ruby 1.9.1's regex engine, which is the Oniguruma regex engine, is actually equivalent to a context-free grammar, though the capturing groups aren't as useful as an actual parser-generator.
This means that "Pumping lemma for context-free languages" should describe the class of languages recognizable by Ruby 1.9.1's regex engine.
EDIT: Whoops! I messed up, and didn't do an important test which actually makes my answer above totally wrong. I won't delete the answer, because it's useful information nonetheless.
EDIT: Coming back to this many months later, I just discovered that my test in the last edit was incorrect. `"aaacbbb"` shouldn't be expected to match `regex` even if `regex` does operate like a context-free grammar. The correct test should be on a string like `"aabcbaa"`, and that does match the regex:
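
The answer's own pattern and test output are not reproduced on this page. As a stand-in, the following is a minimal sketch of how Ruby's named-group subexpression call `\g<name>` behaves on the two test strings discussed. The patterns `PALINDROME` and `ANBN` and the helper `whole_match?` are assumptions made for illustration, not the original `regex`.

```ruby
# Hypothetical pattern: palindromes over {a, b} with a single c in the middle.
# \g<pal> re-enters the named group <pal>, much like a recursive grammar rule.
PALINDROME = /\A(?<pal>a\g<pal>a|b\g<pal>b|c)\z/

# Hypothetical pattern for a^n b^n with n >= 1, using an optional recursive call.
ANBN = /\A(?<ab>a\g<ab>?b)\z/

# Hypothetical helper: true when the pattern matches the entire string.
def whole_match?(pattern, s)
  !pattern.match(s).nil?
end

puts whole_match?(PALINDROME, "aabcbaa") # mirrored around c
puts whole_match?(PALINDROME, "aaacbbb") # not mirrored
puts whole_match?(ANBN, "aaabbb")        # three a's, three b's
puts whole_match?(ANBN, "aaabb")         # counts differ
```

Under these assumed patterns, `"aabcbaa"` matches and `"aaacbbb"` does not, which is consistent with the corrected test described in the edit. The second pattern suggests that a^n b^n is within reach of subexpression calls, although that is a different feature from the plain `\1`-style backreferences the question asks about.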