如何编写所有可计算函数的枚举?
动机:我希望能够通过使用自然数而不是函数,在没有一阶函数的语言中使用玩具函数式编程。
通用函数是函数 f : N → (N -> N),相当于 f : N * N -> N 枚举所有可能的可计算函数。换句话说,有一个数 k 使得 f(k) 是平方函数,有一个数 j 使得 f(j) 是第 n 个素数函数等等。
要编写这样一个函数,可以采用任何图灵-完整的语言(编程语言编译器、lambda演算、图灵机...)并枚举所有程序。我不仅希望允许求值,还允许对加法、组合、柯里化等函数进行操作。例如,给定两个函数 f,g 的索引,我想知道函数 f+g 或由 g 组成的 f 的索引是什么。这将允许“玩具函数式编程”。
编写这样的代码库有什么好方法?我并不是在寻找一个难以计算 10 阶乘的简约图灵 tarpit,也不想编写一个高级编译器。它应该具有一些基本功能,例如加法和编写循环的可能性,但仅此而已。
欢迎使用所有高级语言的解决方案。伪代码、Haskell 和 Python 是首选。您可以假设任意精度的算术。不允许使用 eval
或类似内容。
澄清:枚举函数将包含所有部分递归(可计算)函数 - 这包括不' t 在某些输入上停止。在这种情况下,通用功能将挂起;当然这是不可避免的。另请参阅:m-递归函数 - http://en.wikipedia.org/wiki/ M-递归函数。
Motivation: I'd like to be able to use toy functional programming in languages without first-order functions, by using natural numbers instead of functions.
A universal function is a function f : N -> (N -> N), equivalently f : N * N -> N that enumerates all possible computable functions. In other words, there's a number k such that f(k) is the squaring function, there's a number j such that f(j) is the n-th prime function etc.
To write such a function, one can take any Turing-complete language (programming language compiler, lambda calculus, Turing machines...) and enumerate all programs. I'd like to allow not only evaluation, but also operations on functions like addition, composition, currying. For example, given indices of two functions f,g I'd like to know what is the index of the function f+g, or f composed with g. This would allow "toy functional programming".
What is a good way to write such code library? I'm not looking for a minimalistic Turing tarpit that will struggle to compute factorial of 10, nor I don't want to write an advanced compiler. It should have some basic functions like addition and possibility to write loop, but not much more.
Solutions in all high-level languages are welcome. Pseudocode, Haskell and Python are preferred. You can assume arbitrary precision arithmetic. Using eval
or similar is not allowed.
Clarification: Enumerated functions will consist of all partial recursive (computable) ones - this includes functions that don't halt on some inputs. The universal function will hang in that cases; of course this is unavoidable. See also: m-recursive functions - http://en.wikipedia.org/wiki/Μ-recursive_function.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
你想要的叫做口译员。
首先,任何具有您想要的属性的枚举都不会适合您想要在前 2^32 甚至前 2^64 整数中操作的有趣函数。因此,您将需要更大的整数,在内存中的某个位置分配并通过指针引用。
那么为什么不在任何现有语法中使用表示程序的字符(字符串)数组呢?
如果你高兴的话,可以将这样的字符串视为整数。计算
f1()+f2()
的函数编号是由(f1的表示)、“+”和(f2的表示)组成的字符串。你明白了......这种方法不具有函数表示的唯一性,这可能暗示在你的问题中(我不确定)。我确信的是,表示的唯一性与对函数表示进行简单的、甚至可计算的组合操作是不兼容的——例如,如果不是这种情况,那么对于停止问题会有一个简单的解决方案。
What you want is called an interpreter.
First, any enumeration with the properties you want is not going to fit the interesting functions you want to manipulate in the first 2^32, or even the first 2^64, integers. So you will need larger integers, allocated somewhere in memory and referenced through a pointer.
Why not use arrays of chars (strings) representing program in any existing syntax then?
Consider such a string as an integer if that makes you happy. The number of the function to compute
f1()+f2()
is the string made of (representation of f1), "+", and (representation of f2). You get the idea...What this approach does not have is unicity of the representation of a function, that was maybe implied in your question (I am not sure). What I am sure of is that unicity of representation is incompatible with having simple, or even computable, composition operations on function representations -- For instance, there would be an easy solution to the Halting problem if it wasn't the case.
虽然枚举某种语言中所有可能的表达式并不太难,但您无法将它们限制为那些表示终止函数的表达式。
但是,如果您对终止不感兴趣,那么使用组合器(为了实用而引入一些算术原语)可能是最好的方法,因为您可以避免以这种方式引入变量名。
While it is not too hard to enumerate all possible expressions in some language, you won't be able to restrict these to those expressions that denote terminating functions.
But if you are not interested in termination, using combinators (with some arithmetic primitives thrown in for usefulness) might be the best way, since you avoid introducing variable names that way.
正如 Pascal 所说,你想要的是一个解释器,但可以做得更好:直接使用处理器作为解释器。
将数字 N(例如,作为某个大 int 数组)直接送入缓冲区,并将该缓冲区作为机器代码执行。
对于计算机能够执行的每一个可能的功能,都存在一个 N。不幸的是。并非每个 N 都是有效程序(这没有被要求)或终止程序(这是不可能的)。
另一方面,该功能将产生诸如《魔兽世界》、Microsoft Office 17(包括 Service Pack 6)和 Windows 9 等宝石。
As Pascal said, what you want is a interpreter, but one can do even better: Use the processor as interpreter directly.
Feed the number N (say, as some big int array) directly to a buffer and execute this buffer as machinecode.
For every possible function you computer is able to execute there exists a N. Unfortunatly. not every N is a valid program (this wasn't requested) or the terminating program (which isn't possible).
On the other hand, this function will produce such gems as World of Warcraft, Microsoft Office 17 including Service Pack 6 and Windows 9.
这不是一个简单的问题。我想你必须从一个函数生成器开始,它能够一一生成所有函数。这将产生一个枚举。
既然你必须处理多个无尽的维度......让我们考虑一下。
让我们将问题简化为具有 n 个参数的函数和基本运算 +、-、*、/。
让我们只用一个操作来构建所有函数:
a + a
a + b
一个 - 一个
a - b
一个*一个
a * b
一个/一个
a / b
我想很容易看出其中一些函数比其他函数更有意义,并且其中一些函数可能是相等的,但至少,它是一个可以通过循环生成的映射。
现在,在下一次迭代中,人们可以轻松地向每个函数添加
。 之后,您将获得一个巨大的函数列表,您可以重复第二步。
由于 是一个估计所有更复杂函数(例如 sin 和 log(泰勒级数))的函数,因此这些函数也应该包含在该函数空间中。
这有帮助吗?请随意编辑这篇文章!
只需重新阅读您的帖子即可。如果您想枚举所有编程函数而不仅仅是数字一次,我想它会更复杂。我想,通过压缩函数的源代码并将 zip 文件视为一个大数字来使用映射“函数 <-> 数字”是有意义的。相反,您可以尝试解压缩任何数字,看看它是否创建有用的函数:-)但我猜您会拥有很多甚至不是 zip 文件的数字。
但它会满足您的要求,即每个函数都有一个代表它的数字:-)
not an easy question. I guess you have to start with a function generator which is able generate all functions one by one. This will result in an enumeration.
Since you have to deal with multiple endless dimensions... let's think about it.
Let's reduce the problem to functions with n parameters and the basic operations +, -, *, /.
Let's build all functions with only one operation:
a + a
a + b
a - a
a - b
a * a
a * b
a / a
a / b
I guess it is easy to see that some of these functions make more sense as others and some of them could be equal, but at least, it is a mapping which can be generated through a loop.
Now in the next iteration, one could easily add to each of these functions
Afterwards you have a huge list of functions for which you can repeat step two.
Since the is a function which estimates all more complex functions like sin and log (taylor series), these should be covered in this functions space too.
Does this help? Feel free to edit this post!
Just re-read your post. If you want to enumerate all programmatic functions and not only numeric once, I guess it will be more complex. I guess then it just would make sense to work with a mapping "function <-> number" by zipping the source of your function and treating the zip file as a large number. The other way around, you can try to unzip any number and see if it creates a useful function :-) But I guess you will have lots of number which are not even zip files.
But it would fullfill your requirement that for every function there is a number which represents it :-)
您可以使用任何编程语言来确定某物是否是程序,并按字典顺序列出所有程序。为了至少避免一点组合爆炸,您可以以规范化形式分配用户定义的名称(变量、函数等)。显然,这将产生大量的函数,并且要选出哪些函数真正有用并不容易。任何自动修剪方法要么会排除您实际上想要的某些功能,要么无法将组合爆炸修剪得足够有用,或者两者兼而有之。
这样做的另一个缺点是从数字到函数将非常困难:很难找到比枚举大约四百万亿个函数更好的方法来查找函数 433,457,175,432,167,463。
另一种方法是通过将符号映射到数字并将它们有效地连接来将函数编码为数字。
假设符号为+、-、:=、==、<、if、then、endif、do、end_do_condition、enddo 和语句分隔符。那里有 11 个符号,没有变量,是一个非常小的集合,不包含函数调用之类的东西,并且需要您自己进行乘法和除法。 (我实际上不确定如果没有一两个逻辑运算符,这是否可行。)添加五个变量名,您就得到了一种具有 4 位字符的编程语言。这意味着 64 位无符号整数最多可容纳 16 个字符。
一旦掌握了这一点,函数之间所有可能的关系都将可以表示为算术关系,但这是一种极其复杂的关系,在实践中很难正确处理。
简而言之,虽然这在理论上是可能的,但在实践中却显得过于笨拙。用您选择的非函数式语言为函数式语言编写解释器可能会更容易。
You can take any programming language such that you can determine whether something is a program or not, and list all programs in lexicographic order. To avoid at least a little of the combinatoric explosion, you can assign user-defined names (variables, functions, etc.) in a normalized form. Obviously, this is going to result in an immense number of functions, and it's not going to be easy to pick out which ones are actually useful. Any automatic method of trimming will either exclude some functions that you're actually going to want, or fail to trim the combinatoric explosion enough to be useful, or both.
The other disadvantage of this is that it's going to be very difficult to go from the number to the function: it's hard to find a better way of finding function 433,457,175,432,167,463 than to enumerate about four hundred quadrillion functions.
The other way is to encode a function into a number by mapping the symbols to numbers, and effectively concatenating them.
Assume that the symbols are +, -, :=, ==, <, if, then, endif, do, end_do_condition, enddo, and a statement delimiter. That's 11 symbols right there, without variables, for a pretty minimal set that doesn't include anything like a function call, and requires that you multiply and divide yourself. (I'm not actually sure this would work without a logical operator or two.) Add five variable names, and you've got a programming language with 4-bit characters. This means that a maximum of sixteen characters will fit into a 64-bit unsigned integer.
Once you've got this, all the possible relations between functions are going to be representable as an arithmetic relation, but an immensely complicated one that will be far to complex to get right in practice.
In short, while this is theoretically possible, it's going to be far too clumsy in practice. It would probably be easier to write an interpreter for a functional language in your nonfunctional language of choice.
我不太确定这是否可以完成......感觉它违背了图灵丘奇论文。为了枚举所有程序,首先您需要一个算法来确定哪些程序有效,哪些程序无效,这是不可能的......除非您不关心这一点并允许您的语言中存在不正确的程序。
但也许正式系统的Godelization可以帮助你......我会尝试使用Lisp,将代码作为数据会有很大帮助。
I'm not very sure if that can be done at all... It feels like it goes against the Turing-Church thesis. In order to enumerate all programs, first you need an algorithm that determines which programs are valid and which ones aren't, and that's impossible... Unless you don't care about that and allow incorrect programs in your language.
But maybe the Godelization of a formal system could help you... I'd try with Lisp, having the code as data could help a lot.
不确定我是否理解。但有一件事 - 你无法枚举所有可能的可计算函数。简短的回答:因为否则就会有通用的防病毒软件。长答案:因为如果存在这样的枚举,那么在该集合中也会有计算枚举本身的函数。就像罗素悖论一样。
你的问题的另一个答案是 - 你想“列出”所有可能的可计算函数;为此,您可能希望将它们表示为素数,并将它们的组合用作乘法。这将保证唯一性。因式分解将为您提供相反的函数。
Not sure I understand. One thing though - you can't enumerate all possible computable functions. Short answer: because otherwise there will be a universal antivirus. Long answer: because if there was such an enumeration, you would have in that set also the function that computes the enumeration itself. Like Russel's paradox.
A different answer to your question would be - you want to ``list'' all possible computable functions; to do that, you might want to represent them as prime numbers, and use their composition as multiplication. This would guarantee uniqueness. Factorisation will give you the reverse function.