同像和“不受限制”的。自修改代码 + Lisp真的可以自我修改吗?
我承认我对 Lisp 的了解非常少。然而,我对这门语言非常感兴趣,并计划在不久的将来开始认真学习它。我对这些问题的理解无疑是有缺陷的,所以如果我说了什么明显错误的内容,请评论并纠正我,而不是投反对票。
真正的同像和自修改语言
我正在寻找同时支持同像性(代码与数据具有相同表示)和无限制自修改(无限制含义)的编程语言示例您可以更改正在运行的代码的各个方面,而不仅仅是发出新代码或更改函数指针/委托。)
到目前为止,我发现的只有三个符合此条件的示例:
- 机器代码。同像,一切都是数字。可不受限制地修改,因为它包含指针,可用于操作任何内存地址,无论该地址是否保存代码或数据。
- 马尔博格。与机器代码相同的推理。每条指令在执行后都会修改
- DNA。不是编程语言,但仍然很有趣。它不像机器代码那样可以自我修改;其中实际指令+数据被修改到位。然而,它是自我复制的,并且可以根据其先前的状态突变/进化(具有副作用,例如辐射时不时地搞砸它)。无论如何,这只是一种间接的自我改造方式。简而言之,DNA 可以自我修改,但它是通过完整地复制自身以及相关突变来实现这一点的。 DNA 的物理串是“不可变的”。
为什么 Lisp 不在这个列表中
Lisp 不在这个列表中是因为在我看来 Lisp 只是几乎 同像,并且只支持受限的自我修改。您可以执行类似的操作
(+ 1 2 3)
,这将执行与
(eval '(+ 1 2 3))
第一个版本 (+ 1 2 3)
是原始代码,而在第二个版本中它是数据相同的操作。通过假设这个陈述的真实性,可以说 Lisp 甚至不是同调的。代码与数据具有相同的表示形式,因为它们都是列表/树/S 表达式。但事实上,你必须明确标记这些列表/树/S 表达式中哪些是代码,哪些对我来说是数据,这似乎表明 Lisp 并不是同音的。这些表示非常相似,但它们在微小细节上有所不同,您必须实际说明您正在处理代码还是数据。这无论如何都不是一件坏事(事实上,任何其他事情都是疯狂的),但它凸显了 Lisp 和机器代码之间的区别。在机器代码中,您不必显式标记哪些数字是指令,哪些是指针,哪些是数据。在实际需要解释之前,一切都只是一个数字,此时它可以是其中任何一个。
这是反对无限制自我修改的更强有力的理由。当然,您可以获取代表某些代码的列表并对其进行操作。例如更改
'(+ 1 2 3)
为
'(+ 1 4 3)
然后通过 eval
运行它。但是当您这样做时,您只是编译一些代码并运行它。您不会修改现有代码,您只是发出并运行新代码。 C# 可以使用表达式树做完全相同的事情,即使格式不太方便(这是由于 C# 代码与其 AST 具有不同的表示形式,这与 Lisp 不同,Lisp 是它自己的 AST )。您实际上可以获取整个源文件并在运行时开始修改整个源文件,并且对源文件所做的更改会对程序行为产生实时影响吗?
除非有某种方法可以做到这一点,否则 Lisp 既不是同调的,也不是自我修改的。 (为了推迟对定义的争论,Lisp 的同像或自我修改程度与机器代码不同。)
使 Lisp 同像/无限制自我修改的方法
我可以看到 3 种可能的方法来使 Lisp 与机器代码一样同像/可自我修改。
- 非冯诺依曼架构。如果有人能够发明一些令人惊叹的假设机器,其中程序的最低级别表示是可以直接执行的 AST(无需进一步编译)。在这样的机器上,AST 将代表可执行指令和数据。不幸的是,问题还没有解决,因为 AST 仍然必须是代码或数据。 eval 函数的出现并不会改变这一点。在机器代码中,您可以根据需要在代码和数据之间来回切换。而对于 eval 和 Lisp,一旦您将某个列表从数据“评估”为代码并执行它,就无法再次将该列表作为数据返回。事实上,该列表已永远消失并已被它的值所取代。我们会错过一些关键的东西,而这恰好是指针。
- 列表标签。如果要求每个列表也有一个唯一的标签,则可以通过针对具有给定标签的列表运行函数来进行间接自我修改。与延续相结合,这最终将允许自修改代码,就像机器代码具有它一样。标签相当于机器代码内存地址。作为一个例子,考虑一个 Lisp 程序,其中 AST 的顶部节点具有标签“main”。然后,在 main 内部,您可以执行一个函数,该函数接受一个标签、一个整数、一个原子,并将原子复制到列表中,其标签与提供给函数的标签相匹配,位于整数指定的索引处。然后只需在 main 上调用当前的延续即可。就这样,自我修改代码。
- Lisp 宏。我还没有花时间去理解 Lisp 宏,事实上它们可能完全符合我的想法。
第 1 点与第 2 点相结合将产生一个完全自我修改的 Lisp。前提是所描述的神奇 Lisp 机器能够被生产出来。 2. 单独可以产生一个自我修改的 Lisp,但是在冯·诺依曼架构上的实现可能效率极低。
问题
- 除了机器代码、dna 和 malbolge 之外,还有什么语言可以完全自我修改并且是同像的?
- (如果您对上述文本进行了 tl;dr,请不要费心回答)。 lisp真的是同像+自修改吗?如果你这么说,你能准确地引用我的论点中我误入歧途的地方吗?
附录
具有不受限制的自修改但无同调性的语言
- 汇编。该代码使用单词而不是数字,因此失去了同调性,但它仍然具有指针,这保留了对内存的完全控制并允许不受限制的自我修改。
- 任何使用原始指针的语言。例如 C/C++/Objective C。与
- 包含虚拟指针的汇编 JIT 语言相同的参数。例如,C#/.net 在不安全的上下文中运行。与 Assembly 相同的论点。
其他可能相关/有趣的概念和语言: Lisp、Ruby、Snobol、Forth 以及它的编译时元编程、Smalltalk 以及它的反射、无类型 lambda 演算,其属性是一切都是函数(这意味着假设我们可以发明一台直接执行 lambda 演算的机器,lambda 演算是同像的,而冯·诺依曼机器代码在所述机器上运行时则不会[哥德尔定理将是可执行的,哈哈,可怕的想法。 :P])
I will be forward in admiting that my knowledge of Lisp is extremely minimal. However I am extremely interested in the language and plan to begin seriously learning it in the near future. My understanding of these issues is no doubt flawed, so if I say anything which is blatently wrong, please comment and correct me rather than downvoting.
Truly Homoiconic and Self-modifiable languages
I'm looking for examples of programming languages which support both Homoiconicity (Code has the same representation as data) and unrestricted self modification (Unrestricted meaning that you can change every aspect of your running code, not merely emit new code or change function pointers/delegates.)
There are but three examples I have found so far which fit this criteria:
- Machine code. Homoiconic in that everything is a number. Unrestrictedly modifiable in that it includes pointers, which can be used to manipulate any memory address regardless of whether that address holds code or data.
- Malbolge. Same reasoning as Machine code. Every instruction modifies itself after being executed
- DNA. Not a programming language, but still interesting. It isn't self modifying in the same sense as machine code is; Where the actual instructions + data are modified in place. However it is self replicating and can mutate/evolve according to it's previous state (With side effects such as radiation screwing it up every now and then). This is just an indirect way of self modification anyway. In short, DNA can self modify, but it does so by reproducing itself in it's entirity along with the relevant mutations. A physical string of DNA is "immutable".
Why Lisp is not on this list
Lisp is not on that list because it appears to me that Lisp is only almost Homoiconic, and only supports restricted self modification. You can do something like
(+ 1 2 3)
which will do the same thing as
(eval '(+ 1 2 3))
In the first version (+ 1 2 3)
is raw code, whereas in the second version it is data. By assuming the truth of this statement it can be argued that Lisp isn't even homiconic. The code has the same representation as data in the sense that they are both lists/trees/S-expressions. But the fact that you have to explicitly mark which of these lists/trees/S-expressions are code and which are data to me seems to say that Lisp is not homiconic after all. The representations are extremely similar, but they differ in the tiny detail that you have to actually say whether you are dealing with code or data. This is not in any way a bad thing (In fact anything else would be madness), but it highlights a difference between Lisp and machine code. In machine code you don't have to explicitly mark which numbers are instructions, which are pointers, and which are data. Everything is simply a number until an interpretation is actually required, at which point it could be any of those things.
It's an even stronger case against unrestricted self modification. Sure, you can take a list that represents some code and manipulate it. For example changing
'(+ 1 2 3)
to
'(+ 1 4 3)
And then you run it through eval
. But when you do this you're just compiling some code and running it. You're not modifying existing code, you're just emitting and running new code. C# can do exactly the same thing using expression trees, even if in a less convenient format (which arises due to C# code having a different representation to its AST, as opposed to Lisp, which is its own AST). Can you actually take an entire source file and start modifying that entire source file as it is running, with the changes made to the source file having realtime effects on the program behaviour?
Unless there is some way to do this, Lisp is neither homiconic nor self modifying. (To put off an argument over definitions, Lisp is not homoiconic or self modifying to the same degree that machine code is.)
Ways to make Lisp Homoiconic/Unrestrictedly self-modifiable
I can see 3 potential ways to make Lisp as homoiconic/self-modifiable as machine code.
- Non-Von-Neumann architecture. If someone could invent some amazing hypothetical machine where the lowest level representation of programs is an AST which can be executed directly (No further compilation is necessary). On such a machine an AST would represent both executable instructions, as well as data. Unfortunately the problem won't have been solved, because the AST still has to be either code or data. The prescence of an eval function doesn't change this. In machine code you can flip back and forth between code and data as much as you want. Whereas with eval and Lisp once you've "evaled" some list from data to code and executed it, there's no way to get that list back as data again. In fact, that list is gone forever and has been replaced with it's value. We'd be missing something crucial, which just so happens to be pointers.
- List Labels. If it was a requirement that every list also have a unique label, it would be possible to do indirect self modification by running functions against a list with a given label. Combined with continuations this would finally allow for self modifying code in the same sense that machine code has it. The labels are equivilant to machine code memory addresses. As an example, consider a Lisp program where the top node of the AST has the label "main". Inside main you could then execute a function that takes a label, an Integer, an Atom, and copies the atom to the List with a label that matches the one supplied to the function, at the index specified by the Integer. Then just call with current continuation on main. There you go, self modifying code.
- Lisp Macros. I haven't taken the time to understand Lisp macros, and they may in fact do exactly what I'm thinking about.
Point 1. combined with 2. would produce a totally self-modifying Lisp. Provided that the magical Lisp machine described could be produced. 2. alone could produce a self-modifying Lisp, however the implementation on a Von Neumann architecture could be extremely innefficient.
The Questions
- Are the any languages other than machine code, dna and malbolge which can do total self modification and are homoiconic?
- (DO NOT bother answering if you did a tl;dr at the above text). Is lisp really homoiconic + self modifying? If you say so, can you quote exactly where in my argument I went astray?
Appendix
Languages with unrestricted self modification but no homiconicity
- Assembly. The code uses words as opposed to numbers, so loses homiconicity, but it still has pointers, which retains the total control over memory and allows for unrestricted self modification.
- Any language that uses raw pointers. For example C/C++/Objective C. Same argument as Assembly
- JIT languages that include virtual pointers. For example C#/.net running in an unsafe context. Same argument as Assembly.
Other concepts and languages that may be somehow relevant/interesting:
Lisp, Ruby, Snobol, Forth and it's compile time metaprogramming, Smalltalk and it's reflection, the untyped lambda calculus with it's property that everything is a function (Which kind of implies that assuming we could invent a machine which executes lambda calculus directly, lambda calculus would be homoiconic and Von Neumann machine code would not when run on said machine. [And Godels theorem would be executable. Haha, scary thought :P])
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
这不是真的。在第一个版本中,列表
(+ 1 2 3)
(即数据)被提供给解释器来执行,即被解释为代码。必须将 s 表达式标记为特定上下文中的代码或数据这一事实并不意味着 Lisp 是非同象的。同像的一点是,所有程序都是数据,而不是所有数据都是程序,所以两者还是有区别的。在 Lisp 中,
(1 2 3)
是一个有效的列表,但不是一个有效的程序,因为整数不是函数。[如果我们看看另一种伟大的同像编程语言 Prolog,那么我们会看到相同的现象:我们可以构建一个数据结构
foo(X, 1, bar)
,但没有foo
,我们无法执行它。另外,变量不能是谓词或事实的名称,因此X.
永远不是一个有效的程序。]Lisp 在很大程度上是自我修改的。例如,以下是如何更改函数的定义:
说明:在
[1]
中,我们将函数foo
定义为 add-1 函数。在[2]
中,我们将bar
定义为add-2函数。在[3]
处,我们将foo
重置为add-2函数。在[4]
处,我们看到我们已经成功修改了foo
。This is not true. In the first version, the list
(+ 1 2 3)
, which is data, is being fed to the interpreter to be executed, i.e. to be interpreted as code. The fact that you have to mark s-expressions as being code or data in a specific context does not make Lisp non-homoiconic.The point of homoiconicity is that all programs are data, not that all data are programs, so there is still a difference between the two. In Lisp,
(1 2 3)
is a valid list but not a valid program since an integer is not a function.[If we look at the other great homoiconic programming language, Prolog, then we see the same phenomenon: we can build a data structure
foo(X, 1, bar)
, but without a definition offoo
, we can't execute it. Also, variables cannot be the names of predicates or facts, soX.
is never a valid program.]Lisp is self-modifying to a great degree. E.g., here's how to change the definition of a function:
Explanation: at
[1]
, we defined the functionfoo
to be the add-1 function. At[2]
, we definedbar
to be the add-2 function. At[3]
, we resetfoo
to the add-2 function. At[4]
, we see that we've successfully modifiedfoo
.Lisp 是一个编程语言家族。该家族的成员在能力和实施技术方面存在很大差异。对此有两种不同的解释:
Lisp 是一个共享一些重叠功能集的语言家族。这包括从第一个 Lisp 到 Maclisp、Scheme、Logo、Common Lisp、Clojure 以及数百种不同语言及其实现的所有内容。
Lisp 还有一个主要的语言分支,其名称中也大多带有“Lisp”:Lisp、MacLisp、Common Lisp、Emacs Lisp,...
随着时间的推移,人们投入了大量的研究来改进语言(使其更加“功能化”或使其更加面向对象,或两者兼而有之)并改进实施技术。
例如,Common Lisp 支持编译、各种优化等,允许开发人员在需要灵活性和功能之间取得一定平衡的大型项目中使用它。编译后的函数就是机器代码,而不再是由列表、符号、字符串和数字组成的数据结构。
Common Lisp 允许实现创建静态代码。同时,它为受控运行时修改留下了一些空间(例如,通过使用运行时编译器、加载代码、评估代码、替换代码……)。
OTOH 如果您有一个带有解释器的 Lisp 实现,并且解释器可能使用 Lisp 数据结构来表示源,那么您可以通过更改列表结构来更改正在运行的程序。 Lisp 方言的实现允许这样做。一个典型的事情是宏的使用,它可以由这样的系统在运行时计算。其他一些 Lisp 有所谓的 Fexprs,它们是类似的机制(但通常无法有效编译)。
在基于 Common Lisp 的应用程序(例如,用 Lisp 编写的 CAD 系统)中,大部分源信息已被交付工具删除,这是不可能的。人们将拥有一个机器代码的可执行文件,它消除了大部分运行时灵活性。
同像性也不是一个非常明确定义的概念。对于Lisp我更喜欢我们说源代码可以变成数据,因为源代码使用了s-表达式的外部表示,这是一种数据序列化格式。但并非每个 s 表达式都是有效的 Lisp 程序。此外,源代码作为 Lisp 数据的内部表示无论如何都不是“标志性的”。 Lisp 具有作为外部 s 表达式的源代码,在阅读源代码之后,它具有作为 Lisp 数据的内部表示。 READ 读取它,PRINT 打印它,EVAL 评估它。
Lisp 中还有其他方法来提供对正在运行程序的 Lisp 和程序的访问:
CLOS(通用 Lisp 对象系统)中的元对象协议就是这样一个示例
3Lisp 提供了一个无限的解释器塔。有一个 Lisp 解释器在运行你的程序。这个 Lisp 解释器由另一个 Lisp 解释器运行,而这个解释器又在另一个 Lisp 解释器中运行,而那个解释器也是如此……
Lisp is a family of programming languages. Members of this family are widely different in their capabilities and implementation techniques. There are two different interpretations of this:
Lisp is a family of languages which share some overlapping feature set. This includes everything from the first Lisp, to Maclisp, Scheme, Logo, Common Lisp, Clojure and hundreds of different languages and their implementations.
Lisp also has a main branch of languages which also mostly have 'Lisp' in their name: Lisp, MacLisp, Common Lisp, Emacs Lisp, ...
Lots of research has been invested over time to improve the languages (make it more 'Functional' or make it more object-oriented, or both) and to improve implementation techniques.
Common Lisp for example supports compilation, various optimizations, and more to allow developers to use it in large projects where some balance between flexibility and features is needed. A compiled function is then machine code and no longer a data structure made of lists, symbols, strings and numbers.
Common Lisp allows implementations to create static code. At the same time it leaves some place for controlled runtime modifications (for example by using a runtime compiler, loading code, evaluating code, replacing code, ...).
OTOH if you have a Lisp implementation with an interpreter and additionally the interpreter might use Lisp data structures to represent source, then you are able to change running programs for example by changing the list structure. There are implementations of dialects of Lisp which allow that. One typical thing is the use of macros, which can be computed at runtime by such a system. Some other Lisps have so-called Fexprs, which are a similar mechanism (but which usually can't be effectively compiled).
In a Common Lisp based application (say, a CAD system written in Lisp) where much of the source information has been removed by a delivery tool, this would not be possible. One would have a single executable of machine code, which has much of the runtime flexibility removed.
Homoiconicity is also not a very well defined concept. For Lisp I like more that we say that source code can be turned into data, because the source code uses the external representation of s-expressions, which is a data serialization format. But not every s-expression is a valid Lisp program. Also an internal representation of source code as Lisp data is not 'iconic' in any way. Lisp has source code as external s-expressions and after reading the source code, it has an internal representation as Lisp data. READ reads it, PRINT prints it and EVAL evaluates it.
There are also other approaches in Lisp to provide access to the Lisp which is running your program and to the program:
the meta-object protocol in CLOS (the Common Lisp Object System) is such an example
3Lisp provides an infinite tower of interpreters. There is a Lisp interpreter running your program. This Lisp interpreter is ran by another Lisp interpreter, which is again running in another one, and that one too...
宏可以避免引用,如果这就是您所追求的:
(+ - 5 2)
是代码还是数据?在写入时它看起来像数据。经过宏扩展时间,它看起来像代码。如果foo
的定义在其他地方,它可能很容易被(错误地)认为是一个函数,在这种情况下(+ - 5 2)
将是被认为是行为类似于代码的数据的代码。Macros can avoid quoting, if that's what you're after:
Is
(+ - 5 2)
code or data? At write-time it seems like data. After macro-expansion time it seems like code. And if the definition offoo
was somewhere else, it could just as easily be (incorrectly) thought of as a function, in which case(+ - 5 2)
will be thought of as code that behaves like data that behaves like code.我在这里遇到了一个粉丝,我以前也说过,但是请阅读 Paul Graham 的 On Lisp
如果你想了解 Lisp 宏。就实现原本不可行的修改而言,它们是一件大事。此外,我认为区分 Lisp 语言家族、给定的 Lisp 和给定的 Lisp 实现很重要。
我想我对你的论点的主要问题出现在“为什么 Lisp 不在这个列表中”之后的第一段,并且与 Lisp 中的 REPL 有关。当您在解释器中输入 exp (+ 1 2 3) 时,您实际上是在调用列表 (+ 1 2 3) 上的函数 EVAL。您描述的“原始代码”实际上是“数据”,被馈送到其他 Lisp 代码中,它只是特定上下文中的数据。
I come across as a fanboy here, and I've said it before, but read Paul Graham's On Lisp
if you want to learn about Lisp Macros. They're a big deal in terms of enabling modifications that would be otherwise infeasible. In addition, I think it's important to distinguish here between Lisp the language family, a given Lisp, and a given Lisp implementation.
I suppose the main issue I take with your argument comes in the first paragraph after "Why Lisp is not on this list.", and has to do with the REPL in Lisp. When you enter the exp (+ 1 2 3) into the interpreter, you are in fact calling the function EVAL on the list (+ 1 2 3). The "raw code" you describe is in fact "data", being fed to other lisp code, it's just data in a specific context.