Are side effects possible in pure functional programming?
I have been trying to wrap my head around functional programming for a while now. I have looked up lambda calculus, LISP, OCaml, F# and even combinatory logic, but the main problem I have is this: how do you do things that require side effects, like:
- interacting with a user,
- communicating with a remote service, or
- running simulations that use random sampling
without violating the fundamental premise of pure functional programming, which is that for a given input the output is deterministic?
I hope I am making sense; if not I welcome any attempts to help me understand. Thanks in advance.
Haskell does it by using monads; see Wikipedia and the explanation on the Haskell site.
Basically, the idea is that you never get rid of the IO monad. My understanding is that you can chain functions that unwrap an IO value and apply a function to it, but you cannot remove the IO monad altogether.
Another example of a monad, one that is not directly tied to IO, is the Maybe monad. In contrast to the IO monad, this monad is 'unwrappable', which makes it easier to use when explaining monads. Let's assume you have a function wrap that applies an ordinary function to the value inside a Maybe.
Now you can call
wrap (Just 4) (5+)
which will return Just 9.
The idea of the IO monad is that you can use functions like (+5) on the internal type. The monad ensures that the functions are called in sequence, because every function is chained through the wrapping IO monad.
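The definition of wrap does not appear above. A minimal sketch that is consistent with the call shown, assuming wrap simply applies a plain function to the value inside a Maybe (the type signature and the Nothing case are assumptions, not taken from the answer), could look like this:

```haskell
-- Hypothetical definition: only the call `wrap (Just 4) (5+)` appears in the text.
wrap :: Maybe a -> (a -> b) -> Maybe b
wrap Nothing  _ = Nothing        -- nothing inside, so there is nothing to apply f to
wrap (Just x) f = Just (f x)     -- apply f to the wrapped value and re-wrap the result
```

With this definition, wrap (Just 4) (5+) evaluates to Just 9. Under the same assumption it is just the standard fmap with its arguments flipped, so fmap (5+) (Just 4) gives the same result.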
Given that most programs have some effect on the outside world (writing to files, modifying data in a database...), programs as a whole are rarely free of side effects. Outside of academic exercises, there is no point in even trying.
But programs are assembled out of building blocks (subroutine, function, method, call it what you want), and pure functions make for very well-behaved building blocks.
Most functional programming languages do not require functions to be pure, although good functional programmers will try to make as many of their functions pure as is feasible and practical, in order to reap the benefits of referential transparency.
Haskell goes further. Every part of a Haskell program is pure (at least in the absence of sins such as "unsafePerformIO"). All functions that you write in Haskell are pure.
Side-effects are introduced through monads. They can be used to introduce a sort of "shopping-list -- shopper"-separation. Essentially your program writes a shopping list (which is just data and can be manipulated in a pure fashion), while the language runtime interprets the shopping list and does the effectful shopping. All your code is pure and friendly to equational reasoning and such, whereas the impure code is provided by the compiler-writers.
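The shopping-list idea can be sketched with an explicit data type. This is only a model: in real Haskell the IO type itself plays the role of the list and the runtime is the shopper. The names Command, greet, run and runAll, and the file name greeted.log, are hypothetical.

```haskell
-- A hypothetical "shopping list": plain data that describes effects.
data Command
  = PrintLine String
  | AppendToFile FilePath String
  deriving (Eq, Show)

-- Pure code: builds the list and performs nothing.
greet :: [String] -> [Command]
greet = concatMap item
  where
    item name =
      [ PrintLine ("Hello, " ++ name)
      , AppendToFile "greeted.log" (name ++ "\n")
      ]

-- The "shopper": the only place where effects actually happen.
run :: Command -> IO ()
run (PrintLine s)         = putStrLn s
run (AppendToFile path s) = appendFile path s

runAll :: [Command] -> IO ()
runAll = mapM_ run
```

Because greet returns ordinary data, it can be inspected, compared and tested without performing any I/O; only runAll (greet names) touches the outside world. A flat list like this cannot make later commands depend on earlier input, which is one reason the real mechanism is a monad.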
Even if you don't use it in your work, learning one or more functional programming languages is a great way to learn to think differently and gives you a toolkit of alternative approaches to problems (it can also frustrate you when you can't do something as neat and clean as a functional approach in other languages).
And it made me better at writing XSL stylesheets.
That depends on what is meant by pure...
...that's an interesting description: here's a similar one:
...which he then expands upon:
We'll leave the choice of terms like "pure", "referentially transparent" and "side effects" for another question, instead choosing to modify the question posed here to avoid using them:
Burton's solution uses what he calls pseudo-data: a supply of abstract single-use values. An initial source is then made available in the form of a suitable structured value - Burton uses a tree:
The original tree - passed as an argument to the running program - is divided up into subtrees, to be distributed (also as arguments to the program's functions) throughout the program;
From those subtrees, new abstract values are then retrieved for use by primitive functions, in which the intended effects occur.
Each abstract value can only be used once, so each primitive call requires another new abstract value as input - if a primitive call is somehow duplicated, the output will be the same.
In addition to providing nondeterminism, Burton briefly describes how his approach can be extended to access other system resources (specifically timestamps and spacestamps). For more information, read his paper - it's only 5 pages long...
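A rough sketch of the tree-of-single-use-values idea follows. It is not Burton's code: in his scheme the values come from outside the program, whereas here they are computed deterministically from a seed so that the example runs on its own. The names Supply, mkSupply, split, sample and twoSamples are hypothetical.

```haskell
-- Hypothetical model: one abstract value plus two independent subtrees.
data Supply = Supply Int Supply Supply

-- Lazily build an infinite tree in which every node carries a distinct label.
-- (Assumption: a stand-in for a source that would really be supplied by the runtime.)
mkSupply :: Int -> Supply
mkSupply n = Supply n (mkSupply (2 * n)) (mkSupply (2 * n + 1))

-- Divide a supply so each part of the program gets its own subtree.
split :: Supply -> (Supply, Supply)
split (Supply _ left right) = (left, right)

-- A "primitive" that consumes the value at the root of a supply.
sample :: Supply -> Int
sample (Supply v _ _) = (v * 7919) `mod` 1000

-- A function needing two independent samples passes a fresh subtree to each call.
twoSamples :: Supply -> (Int, Int)
twoSamples s = (sample left, sample right)
  where
    (left, right) = split s
```

Calling sample twice on the same Supply gives the same number, which matches the point above that a duplicated primitive call produces the same output; getting a different number requires a different subtree.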
Most real-world functional programming is not "pure" in most senses, so half of the answer to your question is "you do it by giving up on purity". That said, there are alternatives.
In the "purest" sense of pure, the entire program represents a single function of one or more arguments, returning a value. If you squint your eyes and wave your hands a bit, you can declare that all user input is part of the function's "arguments" and that all output is part of the "return value" and then fudge things a bit so that it only does the actual I/O "on demand".
A similar perspective is to declare that the input to the function is "the entire state of the outside world" and that evaluating the function returns a new, modified "state of the world". In that case, any function in the program that uses the world state is obviously freed from being "deterministic" since no two evaluations of the program will have exactly the same outside world.
If you wanted to write an interactive program in the pure lambda calculus (or something equivalent, such as the esoteric language Lazy K), that's conceptually how you'd do it.
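The "all input is the argument, all output is the return value" view can be sketched in Haskell with the standard interact function, which relies on lazy evaluation to perform the actual I/O on demand. The names shout and runShout, and the "quit" convention, are made up for this illustration.

```haskell
import Data.Char (toUpper)

-- The whole program as one pure function from all input to all output.
-- It echoes each line in upper case until it sees the line "quit".
shout :: String -> String
shout = unlines . map (map toUpper) . takeWhile (/= "quit") . lines

-- The runtime feeds stdin to the function lazily and prints whatever it returns.
runShout :: IO ()
runShout = interact shout
```

Defining main = runShout turns this into a complete program. The function shout itself knows nothing about terminals or files and can be checked on ordinary strings.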
In more practical terms, the problem comes down to making sure that I/O occurs in the correct order when input is being used as an argument to a function. The general structure of the "pure" solution to this problem is function composition. For instance, say you have three functions that do I/O and you want to call them in a certain order. If you do something like
RunThreeFunctions(f1, f2, f3)
there's nothing to determine the order they'll be evaluated in. On the other hand, if you let each function take the result of another function as an argument, you can chain them like this:
f1( f2( f3()))
in which case you know that f3 will be evaluated first, because the evaluation of f2 depends on its value. [Edit: See also the comment below about lazy vs. eager evaluation. This is important, because lazy evaluation is actually quite common in very pure contexts; e.g., the standard implementation of recursion in the pure lambda calculus is nonterminating under eager evaluation.]
Again, to write an interactive program in the lambda calculus, this is how you'd probably do it. If you wanted something actually usable for programming in, you'd probably want to combine the function composition part with the conceptual structure of functions taking and returning values representing the state of the world, and create some higher-order abstraction to handle pipelining the "world state" values between I/O functions, ideally also keeping the "world state" contained in order to enforce strict linearity, at which point you've all but reinvented Haskell's IO monad.
Hopefully that didn't just make you even more confused.
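The world-passing structure described above can be sketched with a toy world. Everything here is an assumption made for illustration: World is a small record standing in for "the state of the outside world", and Action, readLine, writeLine, andThen and greeting are hypothetical names, not Haskell's actual IO implementation.

```haskell
-- Toy stand-in for the world: input not yet read, and output written so far.
data World = World { pendingInput :: [String], outputLog :: [String] }

-- An action takes a world and returns a result together with the next world.
newtype Action a = Action { runAction :: World -> (a, World) }

readLine :: Action String
readLine = Action $ \w -> case pendingInput w of
  []       -> ("", w)
  (l : ls) -> (l, w { pendingInput = ls })

writeLine :: String -> Action ()
writeLine s = Action $ \w -> ((), w { outputLog = outputLog w ++ [s] })

-- Sequencing: the second action needs the world produced by the first,
-- so the data dependency alone fixes the order of the effects.
andThen :: Action a -> (a -> Action b) -> Action b
andThen (Action f) k = Action $ \w ->
  let (x, w') = f w
  in runAction (k x) w'

greeting :: Action ()
greeting =
  writeLine "Name?" `andThen` \_ ->
  readLine `andThen` \name ->
  writeLine ("Hello, " ++ name)
```

Since no code outside Action ever holds a World directly, a world value cannot be reused or duplicated, which is the "strict linearity" mentioned above; andThen plays the role that >>= plays for IO.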
Haskell is a pure functional programming language. In Haskell all functions are pure (i.e. they always give the same output for the same inputs). But how do you handle side-effects in Haskell? Well, this problem is beautifully solved through the use of monads.
Take I/O as an example. In Haskell every function that does I/O returns an IO computation, i.e. a computation in the IO monad. So, for instance, a function that reads an int from the keyboard, instead of returning an Int, returns an IO computation that yields an Int when it is run.
Because it returns an I/O computation instead of an Int, you cannot use this result directly in a sum, for example. In order to access the Int value you need to "unwrap" the computation. The only way to do this is to use the bind function (>>=). Because this also returns an IO computation, you always end up with an I/O computation. This is how Haskell isolates side-effects. The IO monad acts as an abstraction of the state of the real world (in fact under the covers it is usually implemented with a type named RealWorld for the state part).
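The code from this answer is not shown above. A small sketch of what is being described, using the standard readLn and >>= from the Prelude (the names readInt, sumTwo and askSum are chosen here for illustration), might be:

```haskell
-- An IO computation that yields an Int when it is run.
-- readLn reads one line from standard input and parses it.
readInt :: IO Int
readInt = readLn

-- Adding two such results: bind (>>=) exposes each Int to the next step,
-- and the overall result is again an IO computation.
sumTwo :: IO Int -> IO Int -> IO Int
sumTwo mx my = mx >>= \x -> my >>= \y -> return (x + y)

-- Writing `readInt + readInt` is rejected by the type checker; this is accepted.
askSum :: IO ()
askSum = sumTwo readInt readInt >>= print
```

Note that sumTwo never leaves IO: its result has type IO Int, not Int, which is the isolation the answer refers to.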
Interacting with a user and communicating with a remote service do require some sort of non-functional part to your software.
Many "functional languages" (like most Lisps) are not purely functional. They still let you do things with side-effects, though side-effecty things are "discouraged" in most contexts.
Haskell is "purely functional" but still lets you do non-functional things via the IO monad. The basic idea is that your purely functional program emits a lazy data structure which is evaluated by a non-functional program (which you don't write, it's part of the environment). One could argue that this data structure itself is an imperative program. So you're sort of doing imperative meta-programming in a functional language.
Ignoring which approach is "better", the goal in both cases is to create a separation between the functional and non-functional parts of your programs, and to limit the size of the non-functional parts as much as possible. The functional parts tend to be more reusable, testable, and easier to reason about.
Functional Programming is about limiting & isolating side-effects, not trying to get entirely rid of them... because you can't.
... and yes I find FP useful (certainly with Erlang anyways): I find it is easier to get from "idea" to "program" (or problem to solution ;)... but of course that could just be me.
The only completely pure functional language I know of is the template system in C++. Haskell takes second place by making the imperative portions of the program explicit.
In Haskell the program has mutable state, but functions (almost always) don't. You keep something like 99% of the program pure, and only the portion that interacts with the outside world is impure. Therefore, when you are testing a function, you know there are no side effects. Pure core, with an impure shell.
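As a small illustration of "pure core, impure shell", here is a sketch in which the only IO lives in one thin function. The names average, report and shell are hypothetical, and the input format (one number per line on standard input) is an assumption.

```haskell
-- Pure core: no IO in the types, so it can be tested by plain evaluation.
average :: [Double] -> Maybe Double
average [] = Nothing
average xs = Just (sum xs / fromIntegral (length xs))

report :: [Double] -> String
report xs = maybe "no data" (\m -> "mean = " ++ show m) (average xs)

-- Impure shell: reads the numbers, hands them to the core, prints the result.
shell :: IO ()
shell = do
  contents <- getContents
  putStrLn (report (map read (lines contents)))
```

Testing average or report needs no files, terminals or mocks; only shell touches the outside world, and it contains almost no logic of its own.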
You need to know at least one more essential concept: monads. You will need them to do I/O and the other "useful" stuff!