How useful is Turing completeness? Are neural nets Turing complete?
While reading some papers about the Turing completeness of recurrent neural nets (for example: Turing computability with neural nets, Hava T. Siegelmann and Eduardo D. Sontag, 1991), I got the feeling that the proof given there was not really that practical. For example, the referenced paper needs a neural network whose neuron activity must be of infinite exactness (to reliably represent any rational number). Other proofs need a neural network of infinite size. Clearly, that is not really practical.
But I started to wonder now if it does make sense at all to ask for Turing completeness. By the strict definition, no computer system nowadays is Turing complete because none of them will be able to simulate the infinite tape.
Interestingly, programming language specifications most often leave it open whether they are Turing complete or not. It all boils down to the question of whether they will always be able to allocate more memory and whether the function call stack size is infinite. Most specifications don't really specify this. Of course, all available implementations are limited here, so all practical implementations of programming languages are not Turing complete.
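The stack-size limit is easy to observe in practice. A small sketch in Python, where the limit is an explicit, configurable constant rather than something left open by the specification:

```python
import sys

def depth(n=0):
    """Recurse until the interpreter's finite call stack runs out."""
    try:
        return depth(n + 1)
    except RecursionError:
        return n

limit = sys.getrecursionlimit()  # a finite, implementation-chosen bound
reached = depth()
print(0 < reached <= limit)      # True: recursion stops at a finite depth
```

The limit can be raised with `sys.setrecursionlimit`, but only up to what the C stack of the process allows, which illustrates the point: the implementation is always bounded somewhere.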
So, what you can say is that all computer systems are exactly as powerful as finite state machines, and no more.
And that brings me to the question: How useful is the term Turing complete at all?
And back to neural nets: for any practical implementation of a neural net (including our own brain), they will not be able to represent an infinite number of states, i.e. by the strict definition of Turing completeness, they are not Turing complete. So does the question of whether neural nets are Turing complete make sense at all?
The question of whether they are as powerful as finite state machines was answered much earlier (1954 by Minsky; the answer, of course: yes) and also seems easier to answer. I.e., at least in theory, that already proved that they are as powerful as any computer.
Some further questions, closer to what I really want to know:
Is there any theoretical term which can say something more specific about the computational power of a computer? (given its limited memory space)
How can you compare the computational power of practical implementations of neural nets with computers? (Turing completeness is not useful, as argued above.)
11 Answers
It is almost always nice to know which class in the Chomsky hierarchy your system is in. This is especially important for the more constricted classes, like regular languages / finite automata vs. context-free languages. Having the skill to recognize which class the problem you are trying to solve belongs to is also important; otherwise one might try to do silly things like parsing HTML or XML with regular expressions only, which is impossible.
Having the knowledge that your formalism or system is Turing complete tells you that you can build whatever you want with it. It does not say anything about practicality, just the possibility or impossibility of solving problems. This is painfully true when considering Turing tarpits, but there are also many Turing-complete systems which are specifically made for niche purposes that no one should ever dream of using for general-purpose work in a production setting.
In short, a good knowledge of the Chomsky hierarchy will help you out in very many situations, not only for choosing the right type of parser (regexp, pushdown, CFG, or more powerful), but also for choosing the right type of machine or formalism for expressing processes in general.
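A tiny illustration of that boundary: recognizing arbitrarily deep balanced nesting needs a counter (pushdown power), while any fixed regular expression can only check nesting up to a hard-coded depth. A sketch with a single bracket type:

```python
import re

def balanced(s):
    """Pushdown-style check: one counter suffices for a single bracket type."""
    depth = 0
    for ch in s:
        if ch == '<':
            depth += 1
        elif ch == '>':
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

# A regular expression can only handle nesting up to a fixed depth;
# this one recognizes balanced strings of depth at most 2:
up_to_two = re.compile(r'^(<(<>)*>)*$')

print(balanced('<<<>>>'))               # True: any depth is fine
print(bool(up_to_two.match('<<<>>>')))  # False: depth 3 exceeds the pattern
```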
Basically it means that with a programming language or architecture which is Turing complete, you can execute a wide variety of algorithms... mostly, any kind of them. Non-Turing-complete languages have far less potential.
The answer is very simple: if you can emulate a NOR or a NAND gate with it, then it is Turing complete, assuming that the rest is just a matter of combining things together.
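In that spirit, a single threshold neuron suffices for NAND. A minimal sketch (the particular weights are just one of many valid choices):

```python
def nand_neuron(a, b):
    """One threshold unit computing NAND: fires unless both inputs are 1."""
    w1, w2, bias = -2, -2, 3  # weighted sum is > 0 for every input except (1, 1)
    return 1 if w1 * a + w2 * b + bias > 0 else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand_neuron(a, b))  # 1, 1, 1, 0: the NAND truth table
```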
The point of stating that a mathematical model is Turing complete is to reveal the capability of the model to perform any calculation, given a sufficient amount of resources (i.e. infinite), not to show whether a specific implementation of the model does have those resources. Non-Turing-complete models would not be able to handle a specific set of calculations, even with enough resources, something that reveals a difference in the way the two models operate, even when they have limited resources. Of course, to prove this property, you have to assume that the models are able to use an infinite amount of resources, but this property of a model is relevant even when resources are limited.
After many years, let me give an answer to this question myself.
Turing completeness

First, differentiate between a specific machine and the concept of that machine. A concrete PC has finite memory and other technical limits (e.g. an int64 memory address limit or things like that; these are technical limits which could be solved, e.g. by big ints, etc.), but the concept of a PC allows memory to be added on demand. Thus, we can say that the concept of a PC is Turing complete. This is also called an abstract machine.

So, differentiate between a specific machine or device (PC, RNN, human brain) and a programming language or an abstract machine for a programming language. A programming language usually does not have a restriction on the amount of memory it can use, thus programming languages can be Turing complete. See the official C language standard definition, which uses an abstract machine to define the semantics. The main question is: which of those specific classes (PC, RNN, human brain) allow us to formulate abstract machine variants, and are those abstract machine variants Turing complete?
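The concept-vs-concrete-machine distinction can be made tangible: a program written with Python's arbitrary-precision integers and growable lists has no fixed word size or tape length in its abstract semantics, even though any actual run is bounded by physical RAM. A sketch:

```python
# In the abstract semantics there is no int64 ceiling: integers grow on demand.
counter = 2 ** 64            # already past any 64-bit register
counter += 1
print(counter)               # 18446744073709551617

# A "tape" that is extended only when the program actually touches a new cell:
tape = [0]

def write(pos, sym):
    """Allocate more cells on demand, then write the symbol."""
    while pos >= len(tape):
        tape.append(0)
    tape[pos] = sym

write(10, 1)
print(len(tape))             # 11: finite at any moment, unbounded in the concept
```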
Recurrent neural networks (RNN)
On the computational power of neural nets
An often cited paper is On the computational power of neural nets, Siegelmann & Sonntag, 1992, which states that RNNs are Turing complete.
This paper assumes that we have rational numbers without limits in the numerator/denominator, i.e. the infinite memory is encoded as rational numbers, or floating point numbers of infinite precision. See also here. Usually we do not model a NN in a way that it operates with rational numbers (without limit). When we restrict ourselves to (R)NNs with finite-precision weights and activations, the proof in the paper falls apart and does not apply anymore. Thus, this paper is not so relevant.
There is a more recent paper, On the Practical Computational Power of Finite Precision RNNs for Language Recognition, Weiss et al., 2018, which addresses exactly this.
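The finite-precision observation is ultimately just counting: an RNN with n hidden units at b bits of precision each has at most 2^(b·n) distinct hidden states, so it is a (possibly very large) finite state machine. For example, with small illustrative numbers:

```python
def max_states(neurons, bits):
    """Upper bound on distinct hidden states of a finite-precision RNN:
    each of the `neurons` units can take at most 2**bits values."""
    return 2 ** (neurons * bits)

print(max_states(2, 8))   # 65536 states for 2 neurons at 8-bit precision
```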
Universal approximator
It is well known that most standard NNs are universal approximators. This states that, given any function (non-constant, bounded, and continuous) and some allowed threshold, you can construct a NN which approximates this function within the allowed threshold.
This is about finite dimensional vector spaces. When we speak about computability, we speak about sequences, thus we have an infinite dimensional vector space. Thus this property is not so relevant.
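A minimal sketch of the kind of construction behind universal approximation: two sigmoid units, subtracted, form a "bump" approximating the indicator function of an interval, and sums of such bumps approximate arbitrary continuous functions. The interval bounds and sharpness here are arbitrary illustrative values:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bump(x, lo=0.4, hi=0.6, k=200.0):
    """A one-hidden-layer 'network' with two sigmoid units, approximating
    the indicator function of the interval (lo, hi); larger k = sharper."""
    return sigmoid(k * (x - lo)) - sigmoid(k * (x - hi))

print(round(bump(0.5), 3))  # 1.0: close to 1 inside the interval
print(round(bump(0.1), 3))  # 0.0: close to 0 outside
```

Note that this works for a fixed, finite-dimensional input, which is exactly the limitation pointed out above: it says nothing about computations over unbounded sequences.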
Without external memory
The question is how to define the concept of a standard RNN. In the paper mentioned above, it assumes infinite precision in the activations. But I argue this is not a reasonable concept of an RNN, as you never have this. And with the precision being limited, there is no other way for the concept of a standard RNN to have infinite memory.
Thus, to state it explicitly:
Without external memory, neither the standard RNN nor the LSTM is Turing complete.
There is also no straightforward way to define a concept of an RNN where you could add memory on demand.
The memory of an RNN is its most recent hidden activations. Adding more memory means changing the NN, i.e. adding new neurons, thus changing its internal workings. This is like changing the program itself.
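To make the "memory = hidden activations" point concrete, here is a bare Elman-style RNN step in plain Python (hypothetical toy sizes): the entire memory carried between time steps is the fixed-length vector h, and its size is baked into the weight matrices, i.e. into the "program":

```python
import math

def rnn_step(h, x, W_h, W_x, b):
    """One Elman RNN step: new hidden state from old state and input.
    h, x are vectors (lists); W_h, W_x are weight matrices (lists of rows)."""
    return [math.tanh(sum(W_h[i][j] * h[j] for j in range(len(h)))
                      + sum(W_x[i][j] * x[j] for j in range(len(x)))
                      + b[i])
            for i in range(len(h))]

# Two hidden units: the whole inter-step memory is just these two numbers.
h = [0.0, 0.0]
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_x = [[1.0], [-1.0]]
b = [0.0, 0.0]
for x in ([1.0], [0.0], [1.0]):
    h = rnn_step(h, x, W_h, W_x, b)
print(len(h))   # 2: the state size is fixed by the architecture
```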
With external memory
There is the Neural Turing machine (NTM) and a few similar models, which have some sort of external memory.
Here it is straightforward to think about a concept of an NTM where you would add memory on demand.
Thus we can say that the concept of an NTM is Turing complete.
There are some details, like the type of attention used in the heads, which need some adaptation. There is a follow-up to the model, the Differentiable Neural Computer (DNC), which explicitly addresses this and also has an explicit mechanism to add memory.
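The core of such external-memory reads is content-based (softmax) addressing, which can be sketched independently of the full NTM machinery. This is a simplified illustration with dot-product similarity and no learned parameters; the actual NTM uses cosine similarity plus further addressing steps:

```python
import math

def content_read(memory, key, beta=5.0):
    """Softmax attention over memory rows by similarity to a key vector,
    in the style of NTM content addressing (dot-product similarity here)."""
    scores = [beta * sum(k * m for k, m in zip(key, row)) for row in memory]
    mx = max(scores)
    weights = [math.exp(s - mx) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # The read result is a weighted mix of the rows. Note that memory can
    # grow without touching the controller: just append more rows.
    return [sum(w * row[i] for w, row in zip(weights, memory))
            for i in range(len(memory[0]))]

memory = [[1.0, 0.0], [0.0, 1.0]]
r = content_read(memory, key=[1.0, 0.0])
print([round(v, 2) for v in r])   # [0.99, 0.01]: dominated by the matching row
```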
Learning / training
We spoke mostly about the theoretical computational power.
A very different question is whether a NN can learn such a function, i.e. whether training (gradient search) leads to a NN which has learned a computable function.
Human brain
We might interpret the human brain (or any brain) as a kind of complex neural network. We can also ask whether the human brain (model) is Turing complete. See e.g. here. This question is a tricky one. Intuition would say that we are able to perform any kind of computation, thus the human brain is Turing complete. However, the arguments we have outlined here show that an RNN is not Turing complete. Important again is the memory effect. At some point, the memory capacity of a human brain is not enough to operate on some input. Thus we would need external memory. So the human brain together with external memory is Turing complete, obviously. But this is maybe not the interesting question (and it also assumes that a human brain could be as large as needed to encode any Turing machine program, i.e. that there is no upper size of a human brain, which is not really true). More interesting is the question of just the human brain itself, without any external memory. Or basically: how do we define the abstract machine, or the concept of a human brain? Does this concept allow for infinite memory? Is it straightforward to define?
There is one aspect of memory in a human brain which is a bit different than in a standard RNN: it can generalize to a high degree, and the addressing mechanism to access certain activations is different. Also, it has some kind of adaptive weights (which, however, can also only store finite information). Considering this, it is actually already more similar to an NTM than to just an RNN. There are many different kinds of memory in the human brain. Some of it is just as static as in an RNN (like the current activations), but other types allow for associative memory like in an NTM. And thus the human brain is also similar to a PC, which also has finite memory, whereas the concept of a PC has infinite memory.
So maybe we can say that the concept of a human brain has infinite memory as well, due to the associative memory, and can also be as large as needed to encode any program, and thus the concept of a human brain is Turing complete. Maybe this should be called an abstract human brain (an instance of an abstract machine).
Regular feedforward neural networks are not Turing complete. They are, in effect, equivalent to a single complicated mathematical function that may do quite a lot of calculations but doesn't have any ability to perform looping or other control flow operations.
However, if you wire up a neural network with some way to access a stateful environment, then it can be made into a Turing-complete machine.
As a trivial example, you could recreate a classic-style Turing machine where the neural network acts as the controller and the stateful environment provides the tape. You could then train the neural network to emulate the actions of any desired Turing machine state table / configuration (perhaps by supervised learning on the actions of another Turing machine?).
Note: the idea of running a feedforward net repeatedly with some form of state feedback is essentially equivalent to a recurrent neural network. So you can think of a recurrent neural network, plus the logic that runs it repeatedly, as being Turing complete. You need the extra logic (over and above the network itself) to ensure Turing completeness, because it is necessary to handle things like termination, repetition, and I/O.
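That division of labour can be sketched as a driver loop: the controller (which could be a trained network) maps (state, symbol) to (new state, symbol, move), while the surrounding logic handles the tape, repetition, and termination. A minimal sketch with a hand-written controller standing in for the network:

```python
def run(controller, tape, state='start', max_steps=1000):
    """Drive a Turing-machine-style loop around a stateless controller.
    The controller plays the role the note assigns to the feedforward net;
    the loop, tape, and halting check are the 'extra logic'."""
    tape = dict(enumerate(tape))   # sparse tape, grows on demand
    pos = 0
    for _ in range(max_steps):     # the extra logic handles termination
        if state == 'halt':
            break
        state, symbol, move = controller(state, tape.get(pos, 0))
        tape[pos] = symbol
        pos += move
    return [tape[i] for i in sorted(tape)]

def flipper(state, symbol):
    """Toy controller: flip each bit moving right, halt on the marker 2."""
    if symbol == 2:
        return 'halt', symbol, 0
    return 'run', 1 - symbol, 1

print(run(flipper, [0, 1, 0, 2]))  # [1, 0, 1, 2]
```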
When modern computers are said to be Turing complete, there is an unspoken exception for the infinite storage device Turing described, which is obviously an impossibility on a finite physical computation device. If a computation device can do everything a Turing machine can do (infinite storage notwithstanding), it is Turing complete for all practical intents and purposes. By this less strict definition of Turing completeness, yes, it's possible that many neural networks are Turing complete.
It is of course possible to create one that is not Turing complete.
Turing completeness of recurrent neural networks could mean: the (finite) transition table of each and every Turing machine (with a finite-state head and an infinite tape) can be modelled by a finite recurrent neural network (finitely many neurons, each with finitely many states, especially only two states). The transition table defines three functions:
next-state(current-state,current-symbol)
next-symbol(current-state, current-symbol)
direction(current-state,current-symbol)
This is how a recurrent neural network may perform this task (just a very rough sketch):
The green neurons read the symbol in the current cell (in binary representation), the gray neurons (initially mute) determine the current state, the red neurons write the new symbol to the current cell, and the yellow neurons determine whether to go left or right. The blue neurons are the inner neurons (initially mute).
The claim is, that for each and every Turing machine there is such a recurrent neural network.
I wonder if there is a systematic way to construct such a network from given transition tables.
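One systematic construction (a sketch, not the only possibility): one-hot encode each (state, symbol) pair as an input unit, one-hot encode each possible output value as an output unit, and set a weight to 1 exactly where the transition table maps that pair to that value. Each of the three functions then becomes a single linear lookup layer with threshold/argmax outputs. The two-state table below is a hypothetical example:

```python
def build_lookup_layer(table, states, symbols, outputs):
    """Turn a transition table into a one-hot weight matrix: each
    (state, symbol) pair gets an input unit, each output value an output
    unit; the weight is 1 exactly where the table says so."""
    n_in = len(states) * len(symbols)
    W = [[0] * n_in for _ in outputs]
    for (s, c), out in table.items():
        col = states.index(s) * len(symbols) + symbols.index(c)
        W[outputs.index(out)][col] = 1
    return W

def apply(W, s, c, states, symbols, outputs):
    """Evaluate the layer on a one-hot (state, symbol) input."""
    col = states.index(s) * len(symbols) + symbols.index(c)
    row = max(range(len(W)), key=lambda i: W[i][col])  # threshold/argmax unit
    return outputs[row]

# A two-state "always toggle the state" next-state table (illustrative):
next_state = {('A', 0): 'B', ('A', 1): 'B', ('B', 0): 'A', ('B', 1): 'A'}
W = build_lookup_layer(next_state, ['A', 'B'], [0, 1], ['A', 'B'])
print(apply(W, 'A', 0, ['A', 'B'], [0, 1], ['A', 'B']))   # B
```

The same construction applied to the next-symbol and direction tables yields the other two layers of the sketch above.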
To partially address your second question:
Neural networks have the property of being universal approximators: that is, they can approximate any function to an arbitrary degree of precision. It's the "degree of precision" part that keeps the neural network from needing to be infinite.
I think the concept of Turing completeness is not intended to tell us whether a particular computer can perform a particular task.
Rather, its purpose is to tell whether a particular language is capable of expressing a particular task. That is, I'd say it's really about expressing an algorithm, not performing it.
As neural networks don't have a language, it's a question of expressing an algorithm in terms of a neural network, rather than the capability of that network. So I don't know the answer to the last bit of your question!
I think an important point about the Turing machine is that, for any given input and program, the machine will only need a finite amount of tape, assuming it halts at some point. That's why I would say the term "Turing complete" is useful: you only need finite memory to run one specific program on some specific input (if the program halts). But if you have a non-Turing-complete machine/language/technology, it won't be able to simulate certain algorithms, no matter how much memory you add.
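That observation can be made concrete by simulating a tape that grows only when touched and counting the cells a halting run actually used. A sketch with a trivial "machine" that writes n ones and halts (n is just an illustrative parameter):

```python
def write_ones(n):
    """A halting 'machine': write n ones, moving right, then stop.
    The tape is allocated lazily, so its final size shows how much
    memory this particular run actually needed."""
    tape = {}
    pos = 0
    while pos < n:      # halts after n steps
        tape[pos] = 1
        pos += 1
    return len(tape)

print(write_ones(10))   # 10 cells: finite, because the run halts
```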