您是否发现仍然需要可以更改的变量?如果是,为什么?

发布于 2024-07-14 10:44:58 字数 3162 浏览 11 评论 0原文

我听到的反对函数式语言的论点之一是单一赋值编码太难了,或者至少比“正常”编程难得多。

但是通过查看我的代码,我意识到,如果您使用相当现代的语言编写,我确实没有很多(任何?)使用模式不能使用单一赋值形式编写。

那么,在其作用域的单次调用中变化的变量有哪些用例呢? 请记住,在调用之间变化的循环索引、参数和其他范围绑定值在这种情况下不是多重赋值(除非您必须在主体中更改它们出于某种原因),并假设您正在编写远远高于汇编语言级别的内容,您可以

values.sum

编写类似or (如果未提供总和)

function collection.sum --> inject(zero, function (v,t) --> t+v )

x = if a > b then a else b

or

n = case s 
  /^\d*$/ : s.to_int
  ''      : 0
  '*'     : a.length
  '?'     : a.length.random
  else    fail "I don't know how many you want"

在需要时 的内容,并具有列表理解,地图/收集等可用。

您是否发现在这样的环境中您仍然想要/需要可变变量?如果是,那是为了什么呢?

澄清一下,我并不是要求复述对 SSA 表格的反对意见,而是要求提供这些反对意见适用的具体例子。 我正在寻找带有可变变量的清晰简洁的代码,如果没有它们就无法编写。

到目前为止我最喜欢的例子(也是我期望的最好的反对意见):

  1. Paul Johnson 的 Fisher-Yates 算法 答案,当您包含大 O 约束时,该算法非常强大。 但是,正如 catulahoops 指出的那样,大 O 问题与 SSA 问题无关,而是与可变数据类型相关,并且抛开这一点,算法可以在 SSA 中相当清晰地编写:

     随机播放(Lst) -> 
           数组:to_list(洗牌(数组:from_list(Lst),erlang:长度(Lst) - 1))。 
       洗牌(数组,0)->   大批; 
       洗牌(数组,N)-> 
           K = 随机:均匀(N) - 1, 
           Ek = 数组:get(K, 数组), 
           En = 数组:get(N, 数组), 
           洗牌(数组:设置(K,En,数组:设置(N,Ek,数组)),N-1)。 
      
  2. jpalecek 的 < a href="https://stackoverflow.com/questions/610956/do-you-find-you-still-need-variable-you-can-change-and-if-so-why/622057#622057">区域多边形的示例:

    def area(figure : List[Point]) : Float = { 
        if(图.空) 返回 0 
        最后的值 = 图(0) 
        var 首先 = 图(0) 
        值 ret = 0 
        for (pt <- 图) { 
          ret+=crossprod(最后-第一个,pt-第一个) 
          最后=点 
        } 
        雷特 
      } 
      

    可能仍然会写成这样:

    def area(figure : List[Point]) : Float = { 
          如果图.length <   3 
              0 
            别的 
              var a = 图(0) 
              var b = 图(1) 
              var c = 图(2) 
              如果图.length == 3 
                  幅度(叉积(ba,ca)) 
                别的  
                  FoldLeft((0,a,b))(图.rest)) {  
                     ((t,a,b),c) =>   (t+面积([a,b,c]),a,c) 
                     } 
      

    或者,由于有些人反对这个公式的密度,因此可以重新设计:

    def area([]) = 0.0 # 空图形没有面积 
      def area([_]) = 0.0 # ...点也没有 
      def area([_,_]) = 0.0 # ...或线段 
      def area([a,b,c]) = # 可以直接求三角形的面积 
          幅度(叉积(ba,ca)) 
      def area(figure) = # 对于较大的图形,减少为三角形并求和 
          as_triangles(图).collect(面积).sum 
    
      def as_triangles([]) = [] # 没有至少三个点的三角形 
      def as_triangles([_]) = [] 
      def as_triangles([_,_]) = [] 
      def as_triangles([a,b,c | 其余) = [[a,b,c] |   as_triangles([a,c | 休息])] 
      
  3. 公主的观点关于使用不可变结构实现 O(1) 队列的难度很有趣(并且很可能为令人信服的示例提供基础),但如上所述,它从根本上与数据结构的可变性有关,而不是直接与多重分配问题有关。< /p>

  4. 我对埃拉托斯特尼筛法的答案很感兴趣,但并不相信。 他引用的论文中给出的正确的大O,拉出尽可能多的素数生成器,无论有没有SSA,看起来都不容易正确实现。


嗯,谢谢大家的尝试。 由于大多数答案要么是 1)基于可变数据结构,而不是基于单一赋值,2)在某种程度上它们是关于单一赋值形式,很容易被本领域技术人员反驳,我将从我的演讲和/或重组中划出一条线(也许可以将其作为讨论主题备份,以防万一我在时间用完之前用完词)。

再次感谢。

One of the arguments I've heard against functional languages is that single assignment coding is too hard, or at least significantly harder than "normal" programming.

But looking through my code, I realized that I really don't have many (any?) use patterns that can't be written just as well using single assignment form if you're writing in a reasonably modern language.

So what are the use cases for variables that vary within a single invocation of their scope? Bearing in mind that loop indexes, parameters, and other scope bound values that vary between invocations aren't multiple assignments in this case (unless you have to change them in the body for some reason), and assuming that you are writing in something a far enough above the assembly language level, where you can write things like

values.sum

or (in case sum isn't provided)

function collection.sum --> inject(zero, function (v,t) --> t+v )

and

x = if a > b then a else b

or

n = case s 
  /^\d*$/ : s.to_int
  ''      : 0
  '*'     : a.length
  '?'     : a.length.random
  else    fail "I don't know how many you want"

when you need to, and have list comprehensions, map/collect, and so forth available.

Do you find that you still want/need mutable variables in such an environment, and if so, what for?

To clarify, I'm not asking for a recitation of the objections to SSA form, but rather concrete examples where those objections would apply. I'm looking for bits of code that are clear and concise with mutable variables and couldn't be written so without them.

My favorite examples so far (and the best objection I expect to them):

  1. Paul Johnson's Fisher-Yates algorithm answer, which is pretty strong when you include the big-O constraints. But then, as catulahoops points out, the big-O issue isn't tied to the SSA question, but rather to having mutable data types, and with that set aside the algorithm can be written rather clearly in SSA:

     shuffle(Lst) ->
         array:to_list(shuffle(array:from_list(Lst), erlang:length(Lst) - 1)).
     shuffle(Array, 0) -> Array;
     shuffle(Array, N) ->
         K = random:uniform(N) - 1,
         Ek = array:get(K, Array),
         En = array:get(N, Array),
         shuffle(array:set(K, En, array:set(N, Ek, Array)), N-1).
    
  2. jpalecek's area of a polygon example:

    def area(figure : List[Point]) : Float = {
      if(figure.empty) return 0
      val last = figure(0)
      var first= figure(0)
      val ret = 0
      for (pt <- figure) {
        ret+=crossprod(last - first, pt - first)
        last = pt
      }
      ret
    }
    

    which might still be written something like:

    def area(figure : List[Point]) : Float = {
        if figure.length < 3
            0
          else
            var a = figure(0)
            var b = figure(1)
            var c = figure(2)
            if figure.length == 3
                magnitude(crossproduct(b-a,c-a))
              else 
                foldLeft((0,a,b))(figure.rest)) { 
                   ((t,a,b),c) => (t+area([a,b,c]),a,c)
                   }
    

    Or, since some people object to the density of this formulation, it could be recast:

    def area([])    = 0.0   # An empty figure has no area
    def area([_])   = 0.0   # ...nor does a point
    def area([_,_]) = 0.0   # ...or a line segment
    def area([a,b,c]) =     # The area of a triangle can be found directly
        magnitude(crossproduct(b-a,c-a))
    def area(figure) =      # For larger figures, reduce to triangles and sum
        as_triangles(figure).collect(area).sum
    
    def as_triangles([])      = []  # No triangles without at least three points
    def as_triangles([_])     = []
    def as_triangles([_,_])   = []
    def as_triangles([a,b,c | rest) = [[a,b,c] | as_triangles([a,c | rest])]
    
  3. Princess's point about the difficulty of implementing O(1) queues with immutable structures is interesting (and may well provide the basis for a compelling example) but as stated it's fundamentally about the mutability of the data structure, and not directly about the multiple assignment issue.

  4. I'm intrigued by the Sieve of Eratosthenes answer, but unconvinced. The proper big-O, pull as many primes as you'd like generator given in the paper he cited does not look easy to implement correctly with or without SSA.


Well, thanks everyone for trying. As most of the answers turned out to be either 1) based on mutable data structures, not on single-assignment, and 2) to the extent they were about single assignment form easily countered by practitioners skilled in the art, I'm going to strike the line from my talk and / or restructure (maybe have it in backup as a discussion topic in the unlikely event I run out of words before I run out of time).

Thanks again.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(14

迷爱 2024-07-21 10:44:59

感谢丘奇图灵论,我们知道任何可以用图灵完备语言编写的东西都可以用任何图灵完备语言编写。 所以,当你认真对待它时,如果你足够努力的话,在 Lisp 中没有你做不到的事情,在 C# 中也做不到,反之亦然。 (更重要的是,在大多数情况下,任何一种都会被编译为 x86 机器语言。)

因此,您问题的答案是:不存在这种情况。 所有的案例都更容易被人类用一种范式/语言或另一种方式理解——而这里的理解难易程度与培训和经验有关。

Thanks to the Church-Turing Thesis, we know that anything that can be written in a Turing-complete language can be written in any Turing-complete language. So, when you get right down to it, there's nothing you can't do in Lisp that you couldn't do in C#, if you tried hard enough, or vice versa. (More to the point, either one is going to get compiled down to x86 machine language in most cases anyway.)

So, the answer to your question is: there are no such cases. All there are are cases that are easier for humans to comprehend in one paradigm/language or another-- and the ease of comprehension here is tied to training and experience.

吃不饱 2024-07-21 10:44:59

也许这里的主要问题是语言中循环的风格。 在使用递归的语言中,当再次调用函数时,循环过程中更改的任何值都会重新绑定。 在块中使用迭代器的语言(例如,Smalltalk 和 Ruby 的 inject 方法)往往很相似,尽管许多使用 Ruby 的人仍然会使用 each上的可变变量注入

另一方面,当您使用 whilefor 编写循环代码时,当您可以传入多个参数时,您无法轻松地重新绑定自动出现的变量到执行循环一次迭代的代码块,因此不可变变量会更加不方便。

使用 Haskell 确实是研究可变变量必要性的好方法,因为默认值是不可变的,但可变变量是可用的(如 IORefs、MVars 等) 。 最近我自己也在这样呃“调查”,得出以下结论。

  1. 在绝大多数情况下,可变变量不是必需的,没有它们我也很高兴。

    在绝大多数情况下

  2. 对于线程间通信,可变变量是必不可少的,原因相当明显。 (这是 Haskell 特有的;当然,在最低级别使用消息传递的运行时系统不需要它们。)但是,这种使用非常罕见,必须使用函数来读取和写入它们 (readIORef fooRef val 等)并不是太大的负担。

  3. 我在单个线程中使用了可变变量,因为它似乎使某些事情变得更容易,但后来我后悔了,因为我意识到很难推断存储在那里的值发生了什么。 (几个不同的函数正在操纵该值。)这有点令人大开眼界; 在典型的温水煮青蛙风格中,我没有意识到 Haskell 让我能够如此轻松地推理值的使用,直到我遇到了一个关于我如何使用它们的示例.

所以这些天我相当坚定地站在不可变变量一边。

由于之前对这个问题的回答混淆了这些事情,我觉得有必要在这里非常有力地指出,这个问题与纯粹性和函数式编程都是正交的。 例如,我认为 Ruby 会受益于单赋值局部变量,尽管可能需要对该语言进行一些其他更改(例如添加尾递归)才能真正方便。

Perhaps the main issue here is the style of looping in a language. In langauges where we use recursion, any values changing over the course of a loop are re-bound when the function is called again. Languages using iterators in blocks (e.g., Smalltalk's and Ruby's inject method) tend to be similar, though many people in Ruby would still use each and a mutable variable over inject.

When you code loops using while and for, on the other hand, you don't have the easy re-binding of variables that comes automatically when you can pass in several parameters to your chunk of code that does one iteration of the loop, so immutable variables would be rather more inconvenient.

Working in Haskell is a really good way to investigate the necessity of mutable variables, since the default is immutable but mutable ones are available (as IORefs, MVars, and so on). I've been recently, er, "investigating" in this way myself, and have come to the following conclusions.

  1. In the vast majority of cases, mutable variables are not necessary, and I'm happy living without them.

  2. For inter-thread communication, mutable variables are essential, for fairly obvious reasons. (This is specific to Haskell; runtime systems that use message passing at the lowest level don't need them, of course.) However, this use is rare enough that having to use functions to read and write them (readIORef fooRef val etc.) is not much of a burden.

  3. I have used mutable variables within a single thread, because it seemed to make certain things easier, but later regretted it as I realized that it became very hard to reason about what was happening to the value stored there. (Several different functions were manipulating that value.) This was a bit of an eye-opener; in typical frog-in-the-pot-of-warming-water style, I'd not realized how easy Haskell had made it for me to reason about the use of values until I ran into an example of how I used to use them.

So these days I've come down fairly firmly on the side of immutable variables.

Since previous answers to this question have confused these things, I feel compelled to point out here quite forcefully that this issue is orthogonal to both purity and functional programming. I feel that Ruby, for example, would benefit from having single-assignment local variables, though possibly a few other changes to the language, such as adding tail recursion, would be necessary to make this truly convenient.

横笛休吹塞上声 2024-07-21 10:44:59

当您需要对大型数据结构进行小的更改时该怎么办? 您并不真的希望每次修改一些元素时都复制整个数组或大型类。

What about when you need to make small changes to large data structures? You don't really want to copy a whole array or large class every time you would modify a few elements.

再可℃爱ぅ一点好了 2024-07-21 10:44:59

除了现在你指出之外,我还没有真正考虑过这一点。

事实上,我尽量不下意识地使用多项作业。

这是我正在谈论的示例,用 python

start = self.offset%n
if start:
    start = n-start

编写,以避免不必要的额外模数或减法。 这与 bignum 风格的长整型一起使用,因此它是一个值得优化的。 但问题是,它确实是一个单一的任务。

我根本不会错过多项作业。

I haven't really thought about this much except now that you're pointing it out.

Actually I try not using multiple assignments subconsciously.

Here's an example of what Im talking about, in python

start = self.offset%n
if start:
    start = n-start

Written this way to avoid an unneccesary extra Modulo or subtraction. This is used with bignum style long ints, so its a worthwhile optimization. Thing about it, though, is that it really is a single assignment.

I wouldn't miss multiple assignment at all.

冧九 2024-07-21 10:44:59

我知道您要求提供确实显示可变变量好处的代码。 我希望我能提供它。 但正如之前指出的,没有什么问题不能用两种方式表达。 特别是因为你指出 jpalecek 的多边形示例区域可以用折叠算法来编写(恕我直言,这是更混乱的方式,并将问题带到不同的复杂程度) - 这让我想知道为什么你会降低可变性,所以难的。 因此,我将尝试论证不可变数据和可变数据的共同点和共存。

在我看来,这个问题有点没有抓住重点。 我知道我们程序员倾向于喜欢干净和简单的东西,但有时我们会忽略混合也是可能的。 这可能就是为什么在关于不变性的讨论中很少有人采取中间立场。 我只是想知道为什么,因为让我们面对现实 - 不变性是抽象各种问题的一个很好的工具。 但有时这是一个巨大的痛苦。 有时它实在是太受限制了。 仅此一点就让我停下来思考——我们真的想放弃可变性吗? 真的是非此即彼吗? 我们是否可以达成一些共同点? 不变性什么时候可以帮助我更快地实现目标,可变性什么时候可以帮助我更快地实现目标? 哪种解决方案更容易阅读和维护?(这对我来说是最大的问题)

很多这些问题都受到程序员的品味以及他们用来编程的内容的影响。所以我会重点关注通常是大多数支持不变性论点的中心的方面之一 - 并行性:

并行性通常被投入到围绕不变性的论点中。 我曾经处理过需要 100 多个 CPU 在合理时间内解决的问题集。 它教会了我一件非常重要的事情:大多数时候,并行化数据图的操作实际上并不是最有效的并行化方式。 它确实可以带来很大好处,但不平衡是该问题空间中的一个真正问题。 因此,通常并行处理多个可变图并与不可变消息交换信息会更有效。 这意味着,当我知道该图是孤立的,并且我没有向外界透露它时,我想以我能想到的最简洁的方式对其执行操作。 这通常涉及改变数据。 但是在对数据进行这些操作之后,我想向全世界开放数据 - 如果数据是可变的,这就是我通常会有点紧张的地方。 因为程序的其他部分可能会损坏数据,状态就会变得无效,……因为在向世界开放之后,数据通常会进入并行世界。

因此,现实世界的并行程序通常具有在确定的单线程操作中使用数据图的区域(因为外部根本不知道它们)以及可能涉及多线程操作的区域(希望只是提供不被操作的数据) 。 在这些多线程部分中,我们从不希望它们发生更改 - 处理过时的数据比处理不一致的数据更好。 这可以通过不变性的概念来保​​证。

这让我得出了一个简单的结论:对我来说真正的问题是我熟悉的编程语言都不允许我说:“在这一点之后,整个数据结构将是不可变的”并且“在这里给我这个不可变数据结构的可变副本,请验证只有我可以看到该可变副本”。 现在我必须自己通过翻转只读位或类似的东西来保证这一点。 如果我们能让编译器支持它,它不仅可以保证我在翻转所述位后没有做任何愚蠢的事情,而且它实际上可以帮助编译器进行以前无法做到的各种优化。 另外,对于更熟悉命令式编程模型的程序员来说,该语言仍然具有吸引力。

所以总结一下。 恕我直言,程序通常有充分的理由使用数据图的不可变和可变表示形式。 我认为数据应该默认情况下是不可变的,并且编译器应该强制执行这一点 - 但我们应该有私有可变表示的概念,因为自然存在多方面的领域线程永远不会达到 - 可读性和可维护性可以从更命令式的结构中受益。

I know you asked for code that did show the benefits of mutable variables. And I wish I could provide it. But as pointed out before - there is no problem that can't be expressed in both fashions. And especially since you pointed out that jpalecek's area of a polygon example could be written with a folding algo (which is IMHO way messier and takes the problem to different level of complexity) - well it made me wonder why you are coming down on mutability so hard. So I'll try to make the argument for a common ground and an coexistence of immutable and mutable data.

In my opinion this question misses the point a bit. I know that us programmers are prone to liking things clean and simple but we sometimes miss that a mixture is possible as well. And that's probably why in the discussion about immutability there is seldom somebody taking the middle ground. I just wonder why, because let's face it - immutability is a great tool of abstracting all kinds of problems. But sometimes it is a huge pain in the ass. Sometimes it simply is too constraining. And that alone makes me stop and thing - do we really want to loose mutability? Is it really either-or? Isn't there some common ground we can arrive at? When does immutability help me achieve my goals faster, when does mutability? Which solution is easier to read and maintain? (Which for me is the biggest question)

A lot of these questions are influenced by a programmer's taste and by what they are used to program in. So I'll focus on one of the aspects that is usually the center of most pro-immutability arguments - Parallelism:

Often parallelism is thrown into the argument surrounding immutability. I've worked on problem sets that required 100+ CPUs to solve in a reasonable time. And it has taught me one very important thing: Most of the time parallelizing the manipulation of graphs of data is really not the kind of thing that will be the most efficient way to parallelize. It sure can benefit greatly, but imbalance is a real problem in that problem-space. So usually working on multiple mutable graphs in parallel and exchanging information with immutable messages is way more efficient. Which means, when I know that the graph is isolated, that I have not revealed it to the outside world, I would like to perform my operations on it in the most concise manner I can think of. And that usually involves mutating the data. But after these operation on the data I want to open the data up to the whole world - and that's the point where I usually get a bit nervous, if the data is mutable. Because other parts of the program could corrupt the data, the state becomes invalid, ... because after opening up to the world the data often does get into the world of parallelism.

So real world parallel programs usually have areas where data graphs are used in definitive single thread operations - because they simply are not known to the outside - and areas where they could be involved in multi-threaded operations (hopefully just supplying data not being manipulated). During those multi-threaded parts we never want them to change - it simply is better to work on outdated data than on inconsistent data. Which can be guaranteed by the notion of immutability.

That made me come to a simple conclusion: The real problem for me is that non of the programming languages I am familiar with allow me to say: "After this point this whole data structure shal be immutable" and "give me a mutable copy of this immutable data structure here, please verify that only I can see the mutable copy". Right now I have to guarantee it myself by flipping a readonly bit or something similar. If we could have compiler support for it, not only would it guarantee for me that I did not do anything stupid after flipping said bit, but it could actually help the compiler do various optimizations that it couldn't do before. Plus - the language would still be attractive for programmers that are more familiar with the imperative programming model.

So to sum up. IMHO programs usually have a good reason to use both immutable and mutable representations of data graphs. I would argue that data should be immutable by default and the compiler should enforce that - but we should have the notion of private mutable representations, because there naturally are areas where multi-threading will never reach - and readability and maintainability could benefit from a more imperative structuring.

流年里的时光 2024-07-21 10:44:58

我遇到的最难的问题是整理列表。 Fisher-Yates 算法(有时也称为 Knuth 算法)涉及迭代列表将每个项目与随机的其他项目交换。 该算法的复杂度为 O(n),众所周知且早已被证明是正确的(在某些应用中是一个重要的属性)。 但它需要可变数组。

这并不是说您不能在函数式程序中进行洗牌。 Oleg Kiselyov 对此进行了描述。 但如果我理解正确的话,函数混洗是 O(n . log n) 因为它是通过构建二叉树来工作的。

当然,如果我需要在 Haskell 中编写 Fisher-Yates 算法,我只需将其放入 ST monad,它可以让您封装涉及 可变数组 在一个漂亮的纯函数中,如下所示:

-- | Implementation of the random swap algorithm for shuffling.  Reads a list
-- into a mutable ST array, shuffles it in place, and reads out the result
-- as a list.

module Data.Shuffle (shuffle) where


import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.STRef
import System.Random

-- | Shuffle a value based on a random seed.
shuffle :: (RandomGen g) => g -> [a] -> [a]
shuffle _ [] = []
shuffle g xs = 
    runST $ do
      sg <- newSTRef g
      let n = length xs
      v <- newListArray (1, n) xs
      mapM_ (shuffle1 sg v) [1..n]
      getElems v

-- Internal function to swap element i with a random element at or above it.
shuffle1 :: (RandomGen g) => STRef s g -> STArray s Int a -> Int -> ST s ()
shuffle1 sg v i = do
  (_, n) <- getBounds v
  r <- getRnd sg $ randomR (i, n)
  when (r /= i) $ do
    vi <- readArray v i
    vr <- readArray v r
    writeArray v i vr
    writeArray v r vi


-- Internal function for using random numbers
getRnd :: (RandomGen g) => STRef s g -> (g -> (a, g)) -> ST s a
getRnd sg f = do
  g1 <- readSTRef sg
  let (v, g2) = f g1
  writeSTRef sg g2
  return v

The hardest problem I've come across is shuffling a list. The Fisher-Yates algorithm (also sometimes known as the Knuth algorithm) involves iterating through the list swapping each item with a random other item. The algorithm is O(n), well known and long-since proven correct (an important property in some applications). But it requires mutable arrays.

That isn't to say you can't do shuffling in a functional program. Oleg Kiselyov has written about this. But if I understand him correctly, functional shuffling is O(n . log n) because it works by building a binary tree.

Of course, if I needed to write the Fisher-Yates algorithm in Haskell I'd just put it in the ST monad, which lets you wrap up an algorithm involving mutable arrays inside a nice pure function, like this:

-- | Implementation of the random swap algorithm for shuffling.  Reads a list
-- into a mutable ST array, shuffles it in place, and reads out the result
-- as a list.

module Data.Shuffle (shuffle) where


import Control.Monad
import Control.Monad.ST
import Data.Array.ST
import Data.STRef
import System.Random

-- | Shuffle a value based on a random seed.
shuffle :: (RandomGen g) => g -> [a] -> [a]
shuffle _ [] = []
shuffle g xs = 
    runST $ do
      sg <- newSTRef g
      let n = length xs
      v <- newListArray (1, n) xs
      mapM_ (shuffle1 sg v) [1..n]
      getElems v

-- Internal function to swap element i with a random element at or above it.
shuffle1 :: (RandomGen g) => STRef s g -> STArray s Int a -> Int -> ST s ()
shuffle1 sg v i = do
  (_, n) <- getBounds v
  r <- getRnd sg $ randomR (i, n)
  when (r /= i) $ do
    vi <- readArray v i
    vr <- readArray v r
    writeArray v i vr
    writeArray v r vi


-- Internal function for using random numbers
getRnd :: (RandomGen g) => STRef s g -> (g -> (a, g)) -> ST s a
getRnd sg f = do
  g1 <- readSTRef sg
  let (v, g2) = f g1
  writeSTRef sg g2
  return v
浊酒尽余欢 2024-07-21 10:44:58

如果你想进行学术论证,那么从技术上来说当然没有必要多次分配一个变量。 证明是所有代码都可以以 SSA(单一静态分配) 形式表示。 事实上,这对于多种静态和动态分析来说是最有用的形式。

同时,我们一开始并不都以 SSA 形式编写代码是有原因的:

  1. 以这种方式编写代码通常需要更多语句(或更多行代码)。 简洁有价值。
  2. 它几乎总是效率较低。 是的,我知道您正在谈论高级语言——一个公平的范围——但即使在 Java 和 C# 的世界中,远离汇编,速度也很重要。 很少有应用程序与速度无关。
  3. 这并不容易理解。 尽管 SSA 在数学意义上“更简单”,但它对常识更加抽象,而常识在现实世界的编程中才是重要的。 如果你必须非常聪明才能理解它,那么它在整个编程中没有地位。

即使在上面的例子中,也很容易找出漏洞。 获取您的 case 语句。 如果有一个管理选项来确定是否允许 '*' ,并有一个单独的选项来确定是否允许 '?' ,该怎么办? 此外,整数情况下不允许为零,除非用户具有允许的系统权限。

这是一个带有分支和条件的更真实的示例。 你能把它写成一个单一的“声明”吗? 如果是这样,您的“声明”真的与许多单独的声明不同吗? 如果没有,您需要多少个临时只写变量? 这种情况比只有一个变量要好得多吗?

If you want to make the academic argument, then of course it's not technically necessary to assign a variable more than once. The proof is that all code can be represented in SSA (Single Static Assignment) form. Indeed, that's the most useful form for many kinds of static and dynamic analysis.

At the same time, there are reasons we don't all write code in SSA form to begin with:

  1. It usually takes more statements (or more lines of code) to write code this way. Brevity has value.
  2. It's almost always less efficient. Yes I know you're talking about higher languages -- a fair scoping -- but even in the world of Java and C#, far away from assembly, speed matters. There are few applications where speed is irrelevant.
  3. It's not as easy to understand. Although SSA is "simpler" in a mathematical sense, it's more abstract from common sense, which is what matters in real-world programming. If you have to be really smart to grok it, then it has no place in programming at large.

Even in your examples above, it's easy to poke holes. Take your case statement. What if there's an administrative option that determines whether '*' is allowed, and a separate one for whether '?' is allowed? Also, zero is not allowed for the integer case, unless the user has a system permission that allows it.

This is a more real-world example with branches and conditions. Could you write this as a single "statement?" If so, is your "statement" really different from many separate statements? If not, how many temporary write-only variables do you need? And is that situation significantly better than just having a single variable?

千年*琉璃梦 2024-07-21 10:44:58

我从来没有发现过这样的情况。 虽然您总是可以发明新名称(例如转换为 SSA 形式),但我实际上发现每个值拥有自己的名称是简单而自然的。 像 Haskell 这样的语言为我提供了很多关于命名哪些值的选择,以及放置名称绑定的两个不同位置(letwhere)。 我发现单一作业形式非常自然,一点也不困难。

我偶尔会怀念能够拥有指向堆上可变对象的指针。 但这些东西没有名字,所以这不是同一个反对意见。 (而且我还发现,当我在堆上使用可变对象时,我倾向于编写更多错误!)

I've never identified such a case. And while you can always just invent new names, as in conversion to SSA form, I actually find it's easy and natural for each value to have its own name. A language like Haskell gives me a lot of choices about which values to name, and two different places to put name bindings (let and where). I find the single-assignment form quite natural and not at all difficult.

I do occasionally miss being able to have pointers to mutable objects on the heap. But these things have no names, so it's not the same objection. (And I also find that when I use mutable objects on the heap, I tend to write more bugs!)

最偏执的依靠 2024-07-21 10:44:58

我认为您会发现最高效的语言允许您混合函数式和命令式风格,例如 OCaml 和 F#。

在大多数情况下,我可以编写一长串“将 x 映射到 y,将 y 减少到 z”的代码。 在 95% 的情况下,函数式编程简化了我的代码,但不变性在一个领域发挥了作用:

实现的难易程度以及不可变堆栈和不可变队列之间的巨大差异。

堆栈既简单又简单与持久性很好地结合在一起,队列是荒谬的。

不可变队列的常见实现使用一个或多个内部堆栈和堆栈循环。 好处是这些队列大部分时间的运行时间为 O(1),但某些操作的运行时间为 O(n)。 如果您依赖于应用程序中的持久性,那么原则上每个操作的运行时间都是 O(n)。 当您需要实时(或至少一致)性能时,这些队列并不适用。

Chris Okasaki 在他的书中提供了不可变队列的实现,他们使用惰性来实现所有操作均为 O(1)。 它是一个非常聪明、相当简洁的实时队列实现——但它需要深入了解其底层实现细节,而且它仍然比不可变堆栈复杂一个数量级。

相比之下,我可以使用可变链表编写堆栈和队列,这些链表在所有操作中都以恒定的时间运行,并且生成的代码将非常简单。


关于多边形的面积,很容易将其转换为函数形式。 假设我们有一个像这样的 Vector 模块:

module Vector =
    type point =
        { x : float; y : float}
        with
            static member ( + ) ((p1 : point), (p2 : point)) =
                { x = p1.x + p2.x;
                  y = p1.y + p2.y;}

            static member ( * ) ((p : point), (scalar : float)) =
                { x = p.x * scalar;
                  y = p.y * scalar;}

            static member ( - ) ((p1 : point), (p2 : point)) = 
                { x = p1.x - p2.x;
                  y = p1.y - p2.y;}

    let empty = { x = 0.; y = 0.;}
    let to_tuple2 (p : point) = (p.x, p.y)
    let from_tuple2 (x, y) = { x = x; y = y;}
    let crossproduct (p1 : point) (p2 : point) =
        { x = p1.x * p2.y; y = -p1.y * p2.x }

我们可以使用一点元组魔法来定义我们的面积函数:

let area (figure : point list) =
    figure
    |> Seq.map to_tuple2
    |> Seq.fold
        (fun (sum, (a, b)) (c, d) -> (sum + a*d - b*c, (c, d) ) )
        (0., to_tuple2 (List.hd figure))
    |> fun (sum, _) -> abs(sum) / 2.0

或者我们可以使用叉积来代替,

let area2 (figure : point list) =
    figure
    |> Seq.fold
        (fun (acc, prev) cur -> (acc + (crossproduct prev cur), cur))
        (empty, List.hd figure)
    |> fun (acc, _) -> abs(acc.x + acc.y) / 2.0

我发现这两个函数都不可读。

I think you'll find the most productive languages allow you to mix functional and imperative styles, such as OCaml and F#.

In most cases, I can write code which is simply a long line of "map x to y, reduce y to z". In 95% of cases, functional programming simplifies my code, but there is one area where immutability shows its teeth:

The wide disparity between the ease of implementing and immutable stack and an immutable queue.

Stacks are easy and mesh well with persistence, queues are ridiculous.

The most common implementations of immutable queues use one or more internal stacks and stack rotations. The upside is that these queues run in O(1) most of the time, but some operations will run in O(n). If you're relying on persistence in your application, then its possible in principle that every operation runs in O(n). These queues are no good when you need realtime (or at least consistent) performance.

Chris Okasaki's provides an implementation of immutable queues in his book, they use laziness to achieve O(1) for all operations. Its a very clever, reasonably concise implementation of a realtime queue -- but it requires deep understanding of its underlying implementation details, and its still an order of magnitude more complex than an immutable stack.

In constrast, I can write a stack and queue using mutable linked lists which run in constant time for all operations, and the resulting code would be very straightforward.


Regarding the area of a polygon, its easy to convert it to functional form. Let's assume we have a Vector module like this:

module Vector =
    type point =
        { x : float; y : float}
        with
            static member ( + ) ((p1 : point), (p2 : point)) =
                { x = p1.x + p2.x;
                  y = p1.y + p2.y;}

            static member ( * ) ((p : point), (scalar : float)) =
                { x = p.x * scalar;
                  y = p.y * scalar;}

            static member ( - ) ((p1 : point), (p2 : point)) = 
                { x = p1.x - p2.x;
                  y = p1.y - p2.y;}

    let empty = { x = 0.; y = 0.;}
    let to_tuple2 (p : point) = (p.x, p.y)
    let from_tuple2 (x, y) = { x = x; y = y;}
    let crossproduct (p1 : point) (p2 : point) =
        { x = p1.x * p2.y; y = -p1.y * p2.x }

We can define our area function using a little bit of tuple magic:

let area (figure : point list) =
    figure
    |> Seq.map to_tuple2
    |> Seq.fold
        (fun (sum, (a, b)) (c, d) -> (sum + a*d - b*c, (c, d) ) )
        (0., to_tuple2 (List.hd figure))
    |> fun (sum, _) -> abs(sum) / 2.0

Or we can use the cross product instead

let area2 (figure : point list) =
    figure
    |> Seq.fold
        (fun (acc, prev) cur -> (acc + (crossproduct prev cur), cur))
        (empty, List.hd figure)
    |> fun (acc, _) -> abs(acc.x + acc.y) / 2.0

I don't find either function unreadable.

晒暮凉 2024-07-21 10:44:58

使用单个赋值来实现该洗牌算法很简单,事实上,它与将迭代重写为尾递归的命令式解决方案完全相同。 (Erlang 因为我可以比 Haskell 更快地编写它。)

 shuffle(Lst) ->
     array:to_list(shuffle(array:from_list(Lst), erlang:length(Lst) - 1)).

 shuffle(Array, 0) -> Array;
 shuffle(Array, N) ->
     K = random:uniform(N) - 1,
     Ek = array:get(K, Array),
     En = array:get(N, Array),
     shuffle(array:set(K, En, array:set(N, Ek, Array)), N-1).

如果这些数组操作的效率是一个问题,那么这是一个关于可变数据结构的问题,与单个赋值无关。

您不会得到这个问题的答案,因为不存在任何示例。 这只是对这种风格熟悉程度的问题。

That shuffle algorithm is trivial to implement using single assignment, in fact it's exactly the same as the imperative solution with the iteration rewritten to tail recursion. (Erlang because I can write it more quickly than Haskell.)

 shuffle(Lst) ->
     array:to_list(shuffle(array:from_list(Lst), erlang:length(Lst) - 1)).

 shuffle(Array, 0) -> Array;
 shuffle(Array, N) ->
     K = random:uniform(N) - 1,
     Ek = array:get(K, Array),
     En = array:get(N, Array),
     shuffle(array:set(K, En, array:set(N, Ek, Array)), N-1).

If the efficiency of those array operations is a concern, then that's a question about mutable data structures and has nothing to do with single assignment.

You won't get an answer to this question because no examples exist. It is only a question of familiarity with this style.

梓梦 2024-07-21 10:44:58

回应杰森——

function forbidden_input?(s)
    (s = '?' and not administration.qmark_ok) ||
    (s = '*' and not administration.stat_ok)  ||
    (s = '0' and not 'root node visible' in system.permissions_for(current_user))

n = if forbidden_input?(s)
    fail "'" + s + "' is not allowed."
  else
    case s
      /^\d*$/ : s.to_int
      ''      : 0
      '*'     : a.length
      '?'     : a.length.random
      else    fail "I don't know how many you want"

In response to Jason --

function forbidden_input?(s)
    (s = '?' and not administration.qmark_ok) ||
    (s = '*' and not administration.stat_ok)  ||
    (s = '0' and not 'root node visible' in system.permissions_for(current_user))

n = if forbidden_input?(s)
    fail "'" + s + "' is not allowed."
  else
    case s
      /^\d*$/ : s.to_int
      ''      : 0
      '*'     : a.length
      '?'     : a.length.random
      else    fail "I don't know how many you want"
挽清梦 2024-07-21 10:44:58

我会错过非纯函数式语言的作业。 主要是因为它们阻碍了循环的实用性。 示例 (Scala):

def quant[A](x : List[A], q : A) = {
  var tmp : A=0
  for (el <- x) { tmp+= el; if(tmp > q) return el; }
  // throw exception here, there is no prefix of the list with sum > q
}

这应该计算列表的分位数,注意累加器 tmp 被分配了多次。

一个类似的例子是:

def area(figure : List[Point]) : Float = {
  if(figure.empty) return 0
  val last = figure(0)
  var first= figure(0)
  val ret = 0
  for (pt <- figure) {
    ret+=crossprod(last - first, pt - first)
    last = pt
  }
  ret
}

主要注意 last 变量。

这些示例可以使用元组上的折叠来重写,以避免多次分配,但这实际上对可读性没有帮助。

I would miss assignments in a non-purely functional language. Mostly because they hinder the usefulness of loops. Examples (Scala):

def quant[A](x : List[A], q : A) = {
  var tmp : A=0
  for (el <- x) { tmp+= el; if(tmp > q) return el; }
  // throw exception here, there is no prefix of the list with sum > q
}

This should compute the quantile of a list, note the accumulator tmp which is assigned to multiple times.

A similar example would be:

def area(figure : List[Point]) : Float = {
  if(figure.empty) return 0
  val last = figure(0)
  var first= figure(0)
  val ret = 0
  for (pt <- figure) {
    ret+=crossprod(last - first, pt - first)
    last = pt
  }
  ret
}

Note mostly the last variable.

These examples could be rewritten using fold on a tuple to avoid multiple assignments, but that would really not help the readability.

初雪 2024-07-21 10:44:58

局部(方法)变量当然永远不会被分配两次。 但即使在函数式编程中,重新分配变量也是允许的。 它正在更改(部分)不允许的值。 正如 dsimcha 已经回答的那样,对于非常大的结构(可能在应用程序的根部),替换整个结构对我来说似乎不可行。 想一想。 应用程序的状态最终都包含在应用程序的入口点方法中。 如果绝对没有状态可以在不被替换的情况下改变,那么每次击键都必须重新启动应用程序。 :(

Local (method) variables certainly never have to be assigned to twice. But even in functional programming re-assigning a variable is allowed. It's changing (part of) the value that's not allowed. And as dsimcha already answered, for very large structures (perhaps at the root of an application) it doesn't seem feasible to me to replace the entire structure. Think about it. The state of an application is all contained ultimately by the entrypoint method of your application. If absolutely no state can change without being replaced, you would have to restart your application with every keystroke. :(

西瓜 2024-07-21 10:44:58

如果您有一个构建惰性列表/树然后再次减少它的函数,函数编译器可能能够使用 森林砍伐

如果很棘手的话,可能就不会了。 那么你的运气、表现和表现就有点不走运了。 内存方面,除非您可以迭代并使用可变变量。

If you have a function that builds a lazy list/tree then reduces it again, a functional compiler may be able to optimize it using deforestation.

If it's tricky, it might not. Then you're sort of out of luck, performance & memory wise, unless you can iterate and use a mutable variable.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文