What is regularity?

Posted on 2024-08-18 00:11:50

This is more of a computer science question than a programming one, but I figure that this is the best place out of all the related sites to ask this.

When I discovered Regular Expressions and looked up the term I assumed that this property of "regularity" refers to the fact that the expression's language has a definable structural pattern. However, in reading about the subject and the theory behind this I learned that there are kinds of languages that are not regular, and yet from the way they are defined it's clear that a pattern can be matched to them. One such language is (a^n)(b^n). Clearly this is a pattern, and yet this is not a regular language. So now I'm left wondering what is it about regular languages that makes them regular, and this language not?

Comments (5)

青衫负雪 2024-08-25 00:11:50

Intuitively explaining computer science is... tricky. I'll give it a shot, but keep in mind that some of this is going to be "close enough" but not theoretically rigorous.

A regular language is one that can be decided by a machine that is computationally equivalent to a finite automaton (DFA/NDFA). A finite automaton can be thought of as a machine that operates purely on states, with no storage. So you can see that a^n b^n cannot be regular, as it requires a machine that can count the number of a's and b's (and thus must have infinite* storage capacity) in order to compare them.

For comparison, (abc)^n is regular, because the number of repetitions is irrelevant.
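To make the contrast concrete, here is a minimal Python sketch. It is only an illustration, and every name in it (`ABC_TRANSITIONS`, `accepts_abc_star`, `accepts_an_bn`) is made up for this example. The first function is a three-state finite automaton for (abc)^n whose only memory is its current state. The second recognizes a^n b^n, but only by keeping an integer counter that can grow without bound, which is exactly the storage a finite automaton does not have.

```python
# Hypothetical illustration; none of these names come from the original answer.

# DFA for (abc)^n, n >= 0: one state per position inside "abc".
ABC_TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "c"): 0,
}


def accepts_abc_star(text: str) -> bool:
    """Run the DFA; the only memory is the current state (0, 1, or 2)."""
    state = 0
    for ch in text:
        state = ABC_TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined for this input: reject
            return False
    return state == 0  # accept only at a boundary between repetitions


def accepts_an_bn(text: str) -> bool:
    """Recognize a^n b^n (n >= 0) with an unbounded counter, which a DFA lacks."""
    count = 0
    seen_b = False
    for ch in text:
        if ch == "a":
            if seen_b:  # an 'a' after a 'b' breaks the a...ab...b shape
                return False
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:  # more b's than a's so far
                return False
        else:
            return False
    return count == 0  # every 'a' was matched by exactly one 'b'
```

Note that `accepts_abc_star` needs the same three states no matter how long the input is, while the value of `count` in `accepts_an_bn` has no fixed upper limit.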

For a more rigorous (and correspondingly denser) view, check the Wikipedia article and linked pages.

*The infinite doesn't matter here, but I mention it for completeness. It might be easier to think of it as "luckily, always just enough" storage.

雨巷深深 2024-08-25 00:11:50

The etymology of the name comes from Kleene's 1950s work describing regular sets using his mathematical notation created for the purpose. See this.

橙味迷妹 2024-08-25 00:11:50

Perhaps the Wikipedia article on regular languages can explain it better than we can. However, I'll give it a shot.

From a theoretical standpoint, a regular language (set of strings) is one that can be recognized by a finite state automaton. In programmer terms, this is equivalent to saying it can be described by a regular expression. All finite languages (finite sets of strings) are regular, but there are some infinite languages, such as a^n b^n (the language of all strings of n a's followed by n b's), that cannot be recognized using an FSA or regular expressions. There are more powerful computational devices (such as modern computers, which are modeled using Turing machines) which can recognize those languages.
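As a rough illustration of the point that every finite language is regular, the Python sketch below builds a regular expression for a finite set of strings by simply listing them as alternatives. The helper names and the bound `MAX_N` are assumptions made for this example. Capping n turns a^n b^n into a finite language, so it becomes matchable; the pattern has to grow with the bound, though, and no single finite pattern covers every n.

```python
import re


def finite_language_regex(words):
    """Build a regex matching exactly the given finite set of strings."""
    # A finite language is regular: just enumerate its members as alternatives.
    alternatives = "|".join(re.escape(word) for word in words)
    return re.compile(f"(?:{alternatives})")


MAX_N = 5  # hypothetical bound chosen for this example

# a^n b^n restricted to 1 <= n <= MAX_N is a finite set of strings.
bounded_words = ["a" * n + "b" * n for n in range(1, MAX_N + 1)]
bounded_pattern = finite_language_regex(bounded_words)


def in_bounded_language(text: str) -> bool:
    """True if text is a^n b^n for some n between 1 and MAX_N."""
    return bounded_pattern.fullmatch(text) is not None
```

With this bound, `in_bounded_language("aaabbb")` is accepted, while six a's followed by six b's is rejected because that string was never listed.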

The reason regular expressions are used so much in programming for string searching is that they can recognize the large majority of strings that are important to us programmers, and at the same time can be implemented to search very quickly using finite state automata.

一直在等你来 2024-08-25 00:11:50

The word "regular" in "regular expression" refers to the mathematical concept of regular, not the everyday English one, just as the word "prime" in mathematics bears little relation to prime beef.

It's inherited by CS (which is a branch of mathematics) to refer to a more specific concept: http://en.wikipedia.org/wiki/Regular_language

冰雪之触 2024-08-25 00:11:50

Regular expressions are not really regular; the name is etymological.
