声明 Lisp 函数“纯”的能力是否会成为现实？有好处吗？

发布于 2024-12-02 03:49:48 字数 691 浏览 2 评论 0原文

我最近读了很多关于 Haskell 的内容，以及它从纯粹中获得的好处< /em> 函数式语言。（我对讨论 Lisp 的 monad 不感兴趣）对我来说，（至少在逻辑上）尽可能地隔离具有副作用的函数是有意义的。我已经大量使用了 setf 和其他破坏性函数，并且我认识到 Lisp 及其（大部分）衍生品中对它们的需求。

我们开始吧：

像 (declare pure) 这样的东西是否可能有助于优化编译器？或者这是一个有争议的问题，因为它已经知道了？
该声明是否有助于证明一个函数或程序，或者至少是一个被声明为纯函数的子集？或者这又是不必要的，因为它对于程序员、编译器和证明者来说已经是显而易见的？
如果没有别的原因，编译器强制使用此声明的函数的纯度并增加 Lisp 程序的可读性/可维护性对程序员有用吗？
这些有什么意义吗？还是我现在太累了，根本无法思考？

我很感激这里的任何见解。欢迎提供有关编译器实现或可证明性的信息。

编辑

澄清一下，我并不打算将这个问题限制在 Common Lisp 中。它显然（我认为）不适用于某些衍生语言，但我也很好奇其他 Lisp 的某些功能是否倾向于支持（或不支持）这种设施。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

云仙小弟 2024-12-09 03:49:48

你有两个答案，但都没有触及真正的问题。

首先，是的，知道函数是纯函数显然是件好事。有大量编译器级别的事情以及用户级别的事情想要知道。鉴于 Lisp 语言非常灵活，您可以稍微扭转一下情况：您只需将声明限制定义中的代码，而不是要求编译器更加努力的“纯粹”声明。这样你就可以保证函数是纯粹的。

您甚至可以使用额外的支持设施来做到这一点——我在对 johanbev 的回答的评论中提到了其中两个：添加不可变绑定和不可变数据结构的概念。我知道在 Common Lisp 中，这些都是非常有问题的，尤其是不可变的绑定（因为 CL 通过“副作用”将代码加载到位）。但这些功能将有助于简化事情，而且它们并不是不可想象的（例如，参见 Racket 实现，它具有不可变对和其他数据结构，并且具有不可变的绑定。

但真正的问题是，即使是一个看起来非常简单的问题也会出现问题（我为此使用了类似方案的语法。）

(define-pure (foo x)
  (cons (+ x 1) (bar)))

似乎很容易分辨。这个函数确实是pure，它不会做任何事情。此外，似乎使用 define-pure 限制主体并仅允许纯代码在这种情况下可以正常工作，并且将允许此定义

现在从问题开始。：

它正在调用 cons，因此它假定它也是纯的。此外，正如我上面提到的，它应该依赖于 cons 。是，所以假设 cons 绑定是简单，因为它是一个已知的内置函数，当然，对 bar 执行相同的操作。
但是cons确实有副作用（即使您谈论的是Racket的不可变对）：它分配一个新的对。这似乎是一个小而可以忽略的问题，但是，例如，如果您允许此类内容出现在纯函数中，那么您将无法自动记忆它们。问题是，有人可能依赖于每个 foo 调用返回一个新的对 - 一个与任何其他现有对都不 eq 的新对。似乎为了让它变得更好，你需要进一步限制纯函数不仅处理不可变的值，而且还处理构造函数并不总是创建新值的值（例如，它可以使用 hash-cons 而不是分配）。
但是该代码还调用 bar ——所以不需要对 bar 做出相同的假设：它必须被称为纯函数，具有不可变的绑定。请特别注意 bar 不接收任何参数 - 因此在这种情况下，编译器不仅可以要求 bar 是一个纯函数，它还可以使用该信息并进行预计算它的价值。毕竟，没有输入的纯函数可以简化为简单值。（顺便说一句，请注意 Haskell 没有零参数函数。）
这带来了另一个大问题。如果 bar 是一个一个输入的函数怎么办？在这种情况下，你会遇到一个错误，并且会抛出一些异常......这不再是纯粹的。异常是副作用。除了其他一切之外，您现在还需要知道 bar 的数量，并且需要避免其他异常。现在，输入 x 怎么样——如果它不是数字会怎样？这也会引发异常，因此您也需要避免它。这意味着您现在需要一个类型系统。
将 (+ x 1) 更改为 (/ 1 x)，您会发现您不仅需要一个类型系统，还需要一个复杂的类型系统足以区分 0。
或者，您可以重新思考整个事情，并拥有永远不会抛出异常的新纯算术运算 - 但由于所有其他限制，您现在离家很远，使用一种完全不同的语言。或者
最后，还有一个 PITA 的副作用：如果 bar 的定义是 (define-pure (bar) (bar)) 怎么办？根据上述所有限制，它当然是纯粹的……但发散是副作用的一种形式，所以即使这样也不再是犹太洁食。（例如，如果你确实让编译器将空函数优化为值，那么对于这个例子，编译器本身将陷入无限循环。）（是的，Haskell 不处理这个问题，它不让它）

You have two answers but neither touch on the real problem.

First, yes, it would obviously be good to know that a function is pure. There's a ton of compiler level things that would like to know that, as well as user level things. Given that lisp languages are so flexible, you could twist things a bit: instead of a "pure" declaration that asks the compiler to try harder or something, you just make the declaration restrict the code in the definition. This way you can guarantee that the function is pure.

You can even do that with additional supporting facilities -- I mentioned two of them in a comment I made to johanbev's answer: add the notion of immutable bindings and immutable data structures. I know that in Common Lisp these are very problematic, especially immutable bindings (since CL loads code by "side-effecting" it into place). But such features will help simplifying things, and they're not inconceivable (see for example the Racket implementation that has immutable pairs and other data structures, and has immutable bindings.

But the real question is what can you do in such restricted functions. Even a very simple looking problem would be infested with issues. (I'm using Scheme-like syntax for this.)

(define-pure (foo x)
  (cons (+ x 1) (bar)))

Seems easy enough to tell that this function is indeed pure, it doesn't do anything . Also, seems that having define-pure restrict the body and allow only pure code would work fine in this case, and will allow this definition.

Now start with the problems:

It's calling cons, so it assumes that it is also known to be pure. In addition, as I mentioned above, it should rely on cons being what it is, so assume that the cons binding is immutable. Easy, since it's a known builtin. Do the same with bar, of course.
But cons does have a side effect (even if you're talking about Racket's immutable pairs): it allocates a new pair. This seems like a minor and ignorable point, but, for example, if you allow such things to appear in pure functions, then you won't be able to auto-memoize them. The problem is that someone might rely on every foo call returning a new pair -- one that is not-eq to any other existing pair. Seems that to make it fine you need to further restrict pure functions to deal not only with immutable values, but also values where the constructor doesn't always create a new value (eg, it could hash-cons instead of allocate).
But that code also calls bar -- so no you need to make the same assumptions on bar: it must be known as a pure function, with an immutable binding. Note specifically that bar receives no arguments -- so in that case the compiler could not only require that bar is a pure function, it could also use that information and pre-compute its value. After all, a pure function with no inputs could be reduced to a plain value. (Note BTW that Haskell doesn't have zero-argument functions.)
And that brings another big issue in. What if bar is a function of one input? In that case you'd have an error, and some exception will get thrown ... and that's no longer pure. Exceptions are side-effects. You now need to know the arity of bar in addition to everything else, and you need to avoid other exceptions. Now, how about that input x -- what happens if it isn't a number? That will throw an exception too, so you need to avoid it too. This means that you now need a type system.
Change that (+ x 1) to (/ 1 x) and you can see that not only do you need a type system, you need one that is sophisticated enough to distinguish 0s.
Alternatively, you could re-think the whole thing and have new pure arithmetic operations that never throw exceptions -- but with all the other restrictions you're now quite a long way from home, with a language that is radically different.
Finally, there's one more side-effect that remains a PITA: what if the definition of bar is (define-pure (bar) (bar))? It certainly is pure according to all of the above restrictions... But diverging is a form of a side effect, so even this is no longer kosher. (For example, if you did make your compiler optimize nullary functions to values, then for this example the compiler itself would get stuck in an infinite loop.) (And yes, Haskell doesn't deal with that, it doesn't make it less of an issue.)

回复收藏 0 原文

辞慾 2024-12-09 03:49:48

给定一个 Lisp 函数，知道它是否是纯函数通常是不可判定的。当然，必要条件和充分条件可以在编译时进行测试。（如果根本不存在不纯操作，则该函数一定是纯函数；如果无条件执行不纯操作，则该函数一定是不纯的；对于更复杂的情况，编译器可以尝试证明该函数是纯函数还是不纯函数，但并非在所有情况下都会成功。）

如果用户可以手动将函数注释为纯函数，则编译器可以 (a.) 更加努力地证明该函数是纯函数，即。在放弃之前花更多时间，或者 (b.) 假设它是，并添加对于不纯函数不正确的优化（例如，记忆结果）。所以，是的，如果假设注释是正确的，将函数注释为纯函数可以帮助编译器。
除了上面的“更加努力”的想法之外，注释对证明东西没有帮助，因为它没有向证明者提供任何信息。，证明者可以在尝试之前假设注释始终存在。）但是，向纯函数附加其纯度的证明可能是有意义的。
编译器可以（a.）在编译时检查纯函数是否确实是纯函数，但这通常是不可判定的，或者（b.）添加代码以尝试在运行时捕获纯函数中的副作用并报告这些副作用作为一个错误。 (a.) 可能对简单的启发式方法有帮助（例如“无条件执行不纯的操作），(b.) 对调试很有用。
不，这似乎是有道理的。希望这个答案也是如此。

回复收藏 0 原文

舟遥客 2024-12-09 03:49:48

当我们可以假设纯度和参考性时，通常的好处就适用
透明度。我们可以自动记住热点。我们可以
自动并行计算。我们可以处理很多事情
比赛条件。我们还可以使用与我们的数据共享结构
知道不能被修改，例如（准）原始“cons()”
不需要复制它正在使用的列表中的 cons-cells。
这些细胞不会受到另一个 cons-cell 的影响
指着它。这个例子有点明显，但是编译器经常
在解决更复杂的结构共享方面表现良好。

然而，实际确定 lambda（函数）是否是纯函数或具有
在 Common Lisp 中，引用透明性非常棘手。请记住
funcall (foo bar) 首先查看 (symbol-function foo)。所以在
这种情况

(defun foo (bar)
  (cons 'zot bar))

foo() 是纯粹的。

下一个 lambda 也是纯的。

(defun quux ()
 (mapcar #'foo '(zong ding flop)))

然而，稍后我们可以重新定义 foo：

(let ((accu -1))
 (defun foo (bar)
   (incf accu)))

下一次对 quux() 的调用不再是纯粹的！旧的纯 foo() 已经
重新定义为不纯的 lambda。哎呀。这个例子也许有点
人为的，但在词汇上重新定义某些内容并不少见
函数，例如使用 let 块。在这种情况下，它不是
可以知道编译时会发生什么。

Common Lisp 具有非常动态的语义，所以实际上
能够提前确定控制流和数据流（例如
编译时的实例）非常困难，并且在大多数有用的情况下
完全无法判定。这是动态语言的典型特征
类型系统。 Lisp 中有很多常见的习语你不能使用
如果您必须使用静态类型。犯规的主要是这些
尝试做很多有意义的静态分析。我们可以为基元做到这一点
就像缺点和朋友一样。但对于涉及其他事情的 lambda 来说
原始我们处于更深的水中，特别是在以下情况下
我们需要考虑功能之间复杂的相互作用。请记住
仅当 lambda 调用的所有 lambda 都是纯的时，它才是纯的。

在我的脑海里，有可能，通过一些深入的宏观学，
消除重新定义问题。从某种意义上说，每个 lambda 得到
一个额外的参数，它是一个代表整个状态的 monad
lisp 图像（我们显然可以限制自己的功能
实际会看）。但这可能更有用
我们自己声明纯度，即我们向编译器承诺
这个 lambda 确实是纯的。如果不这样做的话后果就是
未定义，并且可能会发生各种混乱......

The usual goodies apply when we can assume purity and referential
transparency. We can automatically memoize hotspots. We can
automatically parallelize computation. We can deal away with a lot of
race conditions. We can also use structure sharing with data that we
know cannot be modified, for instance the (quasi) primitive ``cons()''
does not need to copy the cons-cells in the list it's consing to.
These cells are not affected in any way by having another cons-cell
pointing to it. This example is kinda obvious, but compilers are often
good performers in figuring out more complex structure sharing.

However, actually determining if a lambda (a function) is pure or has
referential transparency is very tricky in Common Lisp. Remember that
a funcall (foo bar) start by looking at (symbol-function foo). So in
this case

(defun foo (bar)
  (cons 'zot bar))

foo() is pure.

The next lambda is also pure.

(defun quux ()
 (mapcar #'foo '(zong ding flop)))

However, later on we can redefine foo:

(let ((accu -1))
 (defun foo (bar)
   (incf accu)))

The next call to quux() is no longer pure! The old pure foo() has been
redefined to an impure lambda. Yikes. This example is maybe somewhat
contrived but it's not that uncommon to lexically redefine some
functions, for instance with a let block. In that case it's not
possible to know what would happen at compile time.

Common Lisp has a very dynamic semantic, so actually being
able to determine control flow and data flow ahead of time (for
instance when compiling) is very hard, and in most useful cases
entirely undecidable. This is quite typical of languages with dynamic
type systems. There is a lot of common idioms in Lisp you cannot use
if you must use static typing. It's mainly these that fouls any
attempt to do much meaningful static analysis. We can do it for primitives
like cons and friends. But for lambdas involving other things than
primitives we are in much deeper water, especially in the cases where
we need to look at complex interplay between functions. Remember that
a lambda is only pure if all the lambdas it calls are also pure.

On the top of my head, it could be possible, with some deep macrology,
to do away with the redefinition problem. In a sense, each lambda gets
an extra argument which is a monad that represents the entire state of
the lisp image (we can obviously restrict ourselves to what the function
will actually look at). But it's probably more useful to be able do
declare purity ourselves, in the sense that we promise the compiler
that this lambda is indeed pure. The consequences if it isn't is then
undefined, and all sorts of mayhem could ensue...

回复收藏 0 原文

~没有更多了~