.NET 的正则表达式图灵完备吗?
正则表达式通常被认为是未完成语言的经典示例。例如,“正则表达式”作为此问题的答案 寻找非图灵完备的语言。
根据我对转向完整性概念的理解(也许有些基本),这意味着正则表达式不能用于检查“平衡”模式。平衡的含义具有相同数量的开始字符和结束字符。这是因为要做到这一点需要您具有某种状态,以允许您匹配开始和结束字符。
然而,正则表达式的 .NET 实现引入了平衡组的概念< /a>.此构造旨在让您回溯并查看先前的组是否匹配。这意味着 .NET 正则表达式:
^(?<p>a)*(?<-p>b)*(?(p)(?!))$
可以匹配以下模式:
ab
aabb
aaabbb
aaaabbbb
... etc. ...
这是否意味着 .NET 正则表达式是图灵完备的?或者是否还缺少其他语言图灵完备所需的东西?
Regular expressions are often pointed to as the classical example of a language that is not Turning complete. For example "regular expressions" is given in as the answer to this SO question looking for languages that are not Turing complete.
In my, perhaps somewhat basic, understanding of the notion of Turning completeness, this means that regular expressions cannot be used check for patterns that are "balanced". Balanced meaning have an equal number of opening characters as closing characters. This is because to do this would require you to have some kind of state, to allow you to match the opening and closing characters.
However the .NET implementation of regular expressions introduces the notion of a balanced group. This construct is designed to let you backtrack and see if a previous group was matched. This means that a .NET regular expressions:
^(?<p>a)*(?<-p>b)*(?(p)(?!))$
Could match a pattern that:
ab
aabb
aaabbb
aaaabbbb
... etc. ...
Does this means .NET's regular expressions are Turing complete? Or are there other things that are missing that would be required for the language to be Turing complete?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
在计算理论中,正则表达式描述了正则语言。正则语言类正是那些可以被某种有限状态机识别的语言,或者可以由正则语法生成的语言。但是,您描述的示例(平衡短语)不是常规语言,无法被有限状态机识别或由常规语法生成。事实上,这是所谓的“上下文无关语言”的教科书示例。这些需要下推自动机进行识别。上下文无关语言是常规语言的超集,也是图灵完备语言的真子集。大多数编程语言的语法(而不是语义)是上下文无关的语言。如果您有兴趣了解有关此主题的更多信息,可以从乔姆斯基层次结构开始
In computation theory, a regular expression describes a regular language. The class of regular languages is precisely those languages that can be recognized by some finite state machine, or generated by a regular grammar. However, the example you have described (balanced phrases) is not a regular language and cannot be recognized by a finite state machine or generated by a regular grammar. In fact, this a textbook example of what is called a context-free language. These require a push-down automata for recognition. The class of context-free languages is a superset of regular languages but a proper subset of turing complete languages. The syntax (as opposed to semantics) of most programming languages is a context-free language. If you are interested in learning more about this topic, you can start with the Chomsky hierarchy
.NET 中的正则表达式不是图灵完备的,因为它们总是停止。对于一般的图灵机来说不能这样说。
Regex's in .NET are not Turing complete because they always halt. This cannot be said of a general turing machine.
你几乎错过了图灵完备的定义。
现在,您无法在正则表达式中执行某些操作,因此该语言不是图灵完备的。
你知道,你真的必须像其他人一样使用相同的定义。有限的理解应该促使我们找出真相。
You pretty much miss the definition of turing complete.
Now, you can not do certain things in regular expressions, so the langauge is not turing complete.
You really have to use the same definition like everyone else, you know. Limited understanding should trigger finding out the truth.
@Inuyasha:实际上你可以用正则表达式进行加法。好吧,至少检查计算是否正确完成。唯一的问题是您必须以奇怪的顺序向正则表达式提供输入(您不能使用正则表达式反转字符串(或检查它是否反转))。
模式是:
假设您要以二进制形式添加 1011 和 0110:
如果您按照从有效位到最大的顺序给出此输入,散布第一个操作数、第二个操作数和输出,您将得到字符串 101110010100001。可以通过
Which is a普通品种正则表达式来匹配。您可以将其扩展为十进制加法,但正则表达式会变得非常复杂。
@Inuyasha: Actually you can do addition with a regular expression. Well at least check if the computation is done correctly. Only thing is you have to give the input to the regex in a strange order (you can't reverse a string (or check if it is reversed) with a regex).
The pattern is:
Suppose you want to add 1011 an 0110 in binary:
If you give this input in the order of lease significant bits to greatest, interspersing the first operand, second operand, and the output, you would get the string 101110010100001. This can be matched by
Which is a garden variety regex. You could extend this to decimal addition but the regex would get crazy complicated.