Clojure 中 s 表达式列表的递归

发布于 2024-12-14 05:47:17 字数 1680 浏览 1 评论 0 原文

为了了解一些背景,我正在学习 Clojure 以及更广泛的 Lisp 开发。在我的 Lisp 之路上,我目前正在学习“Little”系列,努力巩固函数式编程和基于递归的解决方案解决的基础。在“The Little Scheduler”中,我完成了许多练习,但是,我在将其中一些练习转换为 Clojure 方面遇到了一些困难。更具体地说,我正在努力将它们转换为使用“recur”以启用 TCO。例如,下面是“occurrs*”函数的基于 Clojure 的实现(来自 Little Schemer),它计算 S 表达式列表中出现的原子的出现次数:

(defn atom? [l]
  (not (list? l)))

(defn occurs [a lst]
  (cond
   (empty? lst) 0
   (atom? (first lst))
    (cond
     (= a (first lst)) (inc (occurs a (rest lst)))
     true (occurs a (rest lst)))
   true (+ (occurs a (first lst))
           (occurs a (rest lst)))))

基本上,(occurrs 'abc ' (abc (def abc) (abc (abc def) (def (((((abc)))))))) 的计算结果为 5。明显的问题是这个定义消耗了堆栈帧并且如果给定的 S 表达式列表太深,将会导致堆栈爆裂。

现在,我了解重构递归函数以使用累加器参数来将递归调用放入尾部位置(以允许 TCO)的选项,但我正在努力是否该选项甚至适用于此类情况。

如果我尝试使用“recur”和累加器参数来重构它,我会得到以下结果:

(defn recur-occurs [a lst]
  (letfn [(myoccurs [a lst count]
            (cond
             (empty? lst) 0
             (atom? (first lst))
             (cond
              (= a (first lst)) (recur a (rest lst) (inc count))
              true (recur a (rest lst) count))
             true (+ (recur a (first lst) count)
                     (recur a (rest lst) count))))]
    (myoccurs a lst 0)))

所以,我觉得我已经快到了,但还没有完全实现。明显的问题是我的“else”子句,其中列表的头部不是原子。从概念上讲,我想将列表中第一个元素的重复结果与列表其余元素的重复结果相加。我正在努力思考如何重构它,以便可以将重复项移动到尾部位置。

“累加器”模式是否有其他技术来实现将递归调用放入我应该在此处应用的尾部位置,或者问题是否只是更“基本”并且没有一个干净的基于 Clojure 的解决方案由于 JVM 缺乏 TCO?如果是后者,一般来说,Clojure 程序需要在 S 表达式列表上重复使用的一般模式应该是什么?就其价值而言,我已经看到使用带有惰性序列技术的多方法(Halloway 的“Programming Clojure”第 151 页作为参考)来“用惰性替换递归” - 但我不确定如何应用该模式在这个示例中,我没有尝试构建列表,而是计算单个整数值。

预先感谢您对此的任何指导。

To set some context, I'm in the process of learning Clojure, and Lisp development more generally. On my path to Lisp, I'm currently working through the "Little" series in an effort to solidify a foundation in functional programming and recursive-based solution solving. In "The Little Schemer," I've worked through many of the exercises, however, I'm struggling a bit to convert some of them to Clojure. More specifically, I'm struggling to convert them to use "recur" so as to enable TCO. For example, here is a Clojure-based implementation to the "occurs*" function (from Little Schemer) which counts the number of occurrences of an atom appearing within a list of S-expressions:

(defn atom? [l]
  (not (list? l)))

(defn occurs [a lst]
  (cond
   (empty? lst) 0
   (atom? (first lst))
    (cond
     (= a (first lst)) (inc (occurs a (rest lst)))
     true (occurs a (rest lst)))
   true (+ (occurs a (first lst))
           (occurs a (rest lst)))))

Basically, (occurs 'abc '(abc (def abc) (abc (abc def) (def (((((abc))))))))) will evaluate to 5. The obvious problem is that this definition consumes stack frames and will blow the stack if given a list of S-expressions too deep.

Now, I understand the option of refactoring recursive functions to use an accumulator parameter to enable putting the recursive call into the tail position (to allow for TCO), but I'm struggling if this option is even applicable to situations such as this one.

Here's how far I get if I try to refactor this using "recur" along with using an accumulator parameter:

(defn recur-occurs [a lst]
  (letfn [(myoccurs [a lst count]
            (cond
             (empty? lst) 0
             (atom? (first lst))
             (cond
              (= a (first lst)) (recur a (rest lst) (inc count))
              true (recur a (rest lst) count))
             true (+ (recur a (first lst) count)
                     (recur a (rest lst) count))))]
    (myoccurs a lst 0)))

So, I feel like I'm almost there, but not quite. The obvious problem is my "else" clause in which the head of the list is not an atom. Conceptually, I want to sum the result of recurring over the first element in the list with the result of recurring over the rest of the list. I'm struggling in my head on how to refactor this such that the recurs can be moved to the tail position.

Are there additional techniques to the "accumulator" pattern to achieving getting your recursive calls put into the tail position that I should be applying here, or, is the issue simply more "fundamental" and that there isn't a clean Clojure-based solution due to the JVM's lack of TCO? If the latter, generally speaking, what should be the general pattern for Clojure programs to use that need to recur over a list of S-expressions? For what it's worth, I've seen the multi method w/lazy-seq technique used (page 151 of Halloway's "Programming Clojure" for reference) to "Replace Recursion with Laziness" - but I'm not sure how to apply that pattern to this example in which I'm not attempting to build a list, but to compute a single integer value.

Thank you in advance for any guidance on this.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

若相惜即相离 2024-12-21 05:47:17

首先,我必须建议您在阅读《小阴谋家》时不要太担心堆栈溢出等实现障碍。当你在愤怒中编程时,认真对待诸如缺乏尾调用优化之类的问题是件好事,但这本书的要点是教你递归地思考。转换示例累加器传递风格当然是一个很好的实践,但它本质上是放弃递归而支持迭代。

然而,我必须先剧透警告,有一种方法可以保持相同的递归算法,而不受 JVM 堆栈的突发奇想的影响。我们可以使用连续传递风格以额外的匿名函数参数 k 的形式创建我们自己的堆栈:

(defn occurs-cps [a lst k]
  (cond
   (empty? lst) (k 0) 
   (atom? (first lst))
   (cond
    (= a (first lst)) (occurs-cps a (rest lst)
                                  (fn [v] (k (inc v))))
    :else (occurs-cps a (rest lst) k))
   :else (occurs-cps a (first lst)
                     (fn [fst]
                       (occurs-cps a (rest lst)
                                   (fn [rst] (k (+ fst rst))))))))

我们不是通过非尾部函数调用隐式创建堆栈,而是将“什么是left to do”在每次调用发生后,并将其作为下一个延续k传递。当我们调用它时,我们从一个表示无事可做的 k 开始,即恒等函数:

scratch.core=> (occurs-cps 'abc 
                           '(abc (def abc) (abc (abc def) (def (((((abc)))))))) 
                           (fn [v] v))
5

我不会进一步讨论如何进行 CPS 的细节,因为这是稍后的章节的 TLS。不过,我要指出的是,这当然还不能完全起作用:

scratch.core=> (def ls (repeat 20000 'foo))          
#'scratch.core/ls
scratch.core=> (occurs-cps 'foo ls (fn [v] v))       
java.lang.StackOverflowError (NO_SOURCE_FILE:0)

CPS 允许我们将所有重要的堆栈构建调用移至尾部位置,但在 Clojure 中,我们需要采取额外的步骤,将它们替换为 < code>recur:

(defn occurs-cps-recur [a lst k]
  (cond
   (empty? lst) (k 0)
   (atom? (first lst))
   (cond
    (= a (first lst)) (recur a (rest lst)
                             (fn [v] (k (inc v))))
    :else (recur a (rest lst) k))
   :else (recur a (first lst)
                (fn [fst]
                  (recur a (rest lst) ;; Problem
                         (fn [rst] (k (+ fst rst))))))))

唉,这出了问题:java.lang.IllegalArgumentException:与 recur 参数计数不匹配,预期:1 个参数,得到:3 (core.clj:39)。最后一个 recur 实际上指的是它上面的 fn,我们用它来表示我们的延续! occurrs-cps-recur 的调用,我们可以获得良好的行为,但病态嵌套的输入仍然会溢出堆栈:

scratch.core=> (occurs-cps-recur 'foo ls (fn [v] v))
20000
scratch.core=> (def nested (reduce (fn [onion _] (list onion)) 
                                   'foo (range 20000)))
#'scratch.core/nested
scratch.core=> (occurs-cps-recur 'foo nested (fn [v] v))
Java.lang.StackOverflowError (NO_SOURCE_FILE:0)

大多数时候,通过将 recur 更改为对 调用 occurrs-* 并期望它返回答案,我们可以让它立即返回一个 thunk。当我们调用那个 thunk 时,它会开始执行一些工作,直到它执行递归调用,这反过来又会返回另一个 thunk。这是trampolined 风格,“反弹”我们的thunk 的函数是trampoline。每次我们进行递归调用时返回一个 thunk 会将堆栈大小限制为一次调用一次,因此我们唯一的限制是堆:

(defn occurs-cps-tramp [a lst k]
  (fn [] 
    (cond
     (empty? lst) (k 0) 
     (atom? (first lst))
     (cond
      (= a (first lst)) (occurs-cps-tramp a (rest lst)
                                          (fn [v] (k (inc v))))
      :else (occurs-cps-tramp a (rest lst) k))
     :else (occurs-cps-tramp a (first lst)
                             (fn [fst]
                               (occurs-cps-tramp a (rest lst)
                                                 (fn [rst] (k (+ fst rst)))))))))

(declare done answer)

(defn my-trampoline [th]
  (if done
    answer
    (recur (th))))

(defn empty-k [v]
  (set! answer v)
  (set! done true))

(defn run []
  (binding [done false answer 'whocares]
    (my-trampoline (occurs-cps-tramp 'foo nested empty-k))))

;; scratch.core=> (run)                             
;; 1

请注意,Clojure 有一个内置的 trampoline(对返回类型有一些限制)。相反,使用它,我们不需要专门的 empty-k

scratch.core=> (trampoline (occurs-cps-tramp 'foo nested (fn [v] v)))
1

蹦床确实是一项很酷的技术,但蹦床程序的先决条件是它必须只包含尾部调用; CPS 是这里真正的明星。它允许您以自然递归的清晰度定义算法,并通过正确性保留转换,在具有单个循环和堆的任何主机上有效地表达它。

Firstly, I must advise you to not worry much about implementation snags like stack overflows as you make your way through The Little Schemer. It is good to be conscientious of issues like the lack of tail call optimization when you're programming in anger, but the main point of the book is to teach you to think recursively. Converting the examples accumulator-passing style is certainly good practice, but it's essentially ditching recursion in favor of iteration.

However, and I must preface this with a spoiler warning, there is a way to keep the same recursive algorithm without being subject to the whims of the JVM stack. We can use continuation-passing style to make our own stack in the form of an extra anonymous function argument k:

(defn occurs-cps [a lst k]
  (cond
   (empty? lst) (k 0) 
   (atom? (first lst))
   (cond
    (= a (first lst)) (occurs-cps a (rest lst)
                                  (fn [v] (k (inc v))))
    :else (occurs-cps a (rest lst) k))
   :else (occurs-cps a (first lst)
                     (fn [fst]
                       (occurs-cps a (rest lst)
                                   (fn [rst] (k (+ fst rst))))))))

Instead of the stack being created implicitly by our non-tail function calls, we bundle up "what's left to do" after each call to occurs, and pass it along as the next continuation k. When we invoke it, we start off with a k that represents nothing left to do, the identity function:

scratch.core=> (occurs-cps 'abc 
                           '(abc (def abc) (abc (abc def) (def (((((abc)))))))) 
                           (fn [v] v))
5

I won't go further into the details of how to do CPS, as that's for a later chapter of TLS. However, I will note that this of course doesn't yet work completely:

scratch.core=> (def ls (repeat 20000 'foo))          
#'scratch.core/ls
scratch.core=> (occurs-cps 'foo ls (fn [v] v))       
java.lang.StackOverflowError (NO_SOURCE_FILE:0)

CPS lets us move all of our non-trivial, stack-building calls to tail position, but in Clojure we need to take the extra step of replacing them with recur:

(defn occurs-cps-recur [a lst k]
  (cond
   (empty? lst) (k 0)
   (atom? (first lst))
   (cond
    (= a (first lst)) (recur a (rest lst)
                             (fn [v] (k (inc v))))
    :else (recur a (rest lst) k))
   :else (recur a (first lst)
                (fn [fst]
                  (recur a (rest lst) ;; Problem
                         (fn [rst] (k (+ fst rst))))))))

Alas, this goes wrong: java.lang.IllegalArgumentException: Mismatched argument count to recur, expected: 1 args, got: 3 (core.clj:39). The very last recur actually refers to the fn right above it, the one we're using to represent our continuations! We can get good behavior most of the time by changing just that recur to a call to occurs-cps-recur, but pathologically-nested input will still overflow the stack:

scratch.core=> (occurs-cps-recur 'foo ls (fn [v] v))
20000
scratch.core=> (def nested (reduce (fn [onion _] (list onion)) 
                                   'foo (range 20000)))
#'scratch.core/nested
scratch.core=> (occurs-cps-recur 'foo nested (fn [v] v))
Java.lang.StackOverflowError (NO_SOURCE_FILE:0)

Instead of making the call to occurs-* and expecting it to give back an answer, we can have it return a thunk immediately. When we invoke that thunk, it'll go off and do some work right up until it does a recursive call, which in turn will return another thunk. This is trampolined style, and the function that "bounces" our thunks is trampoline. Returning a thunk each time we make a recursive call bounds our stack size to one call at a time, so our only limit is the heap:

(defn occurs-cps-tramp [a lst k]
  (fn [] 
    (cond
     (empty? lst) (k 0) 
     (atom? (first lst))
     (cond
      (= a (first lst)) (occurs-cps-tramp a (rest lst)
                                          (fn [v] (k (inc v))))
      :else (occurs-cps-tramp a (rest lst) k))
     :else (occurs-cps-tramp a (first lst)
                             (fn [fst]
                               (occurs-cps-tramp a (rest lst)
                                                 (fn [rst] (k (+ fst rst)))))))))

(declare done answer)

(defn my-trampoline [th]
  (if done
    answer
    (recur (th))))

(defn empty-k [v]
  (set! answer v)
  (set! done true))

(defn run []
  (binding [done false answer 'whocares]
    (my-trampoline (occurs-cps-tramp 'foo nested empty-k))))

;; scratch.core=> (run)                             
;; 1

Note that Clojure has a built-in trampoline (with some limitations on the return type). Using that instead, we don't need a specialized empty-k:

scratch.core=> (trampoline (occurs-cps-tramp 'foo nested (fn [v] v)))
1

Trampolining is certainly a cool technique, but the prerequisite to trampoline a program is that it must contain only tail calls; CPS is the real star here. It lets you define your algorithm with the clarity of natural recursion, and through correctness-preserving transformations, express it efficiently on any host that has a single loop and a heap.

深空失忆 2024-12-21 05:47:17

您无法使用固定数量的内存来执行此操作。您可以消耗堆栈或堆;这就是你要做的决定。如果我在 Clojure 中编写此内容,我会使用 mapreduce 而不是手动递归来完成:

(defn occurs [x coll]
  (if (coll? coll)
    (reduce + (map #(occurs x %) coll))
    (if (= x coll)
      1, 0)))

请注意,如果您使用 tree-seq<,则存在更短的解决方案/code> 或 flatten,但那时大部分问题都消失了,所以没有太多需要学习的东西。

编辑

这是一个不使用任何堆栈的版本,而是让它的队列变得越来越大(耗尽堆)。

(defn heap-occurs [item coll]
  (loop [count 0, queue coll]
    (if-let [[x & xs] (seq queue)]
      (if (coll? x)
        (recur count (concat x xs))
        (recur (+ (if (= item x) 1, 0)
                  count)
               xs))
      count)))

You can't do this with a fixed amount of memory. You can consume stack, or heap; that's the decision you get to make. If I were writing this in Clojure I would do it with map and reduce rather than with manual recursion:

(defn occurs [x coll]
  (if (coll? coll)
    (reduce + (map #(occurs x %) coll))
    (if (= x coll)
      1, 0)))

Note that shorter solutions exist if you use tree-seq or flatten, but at that point most of the problem is gone so there's not much to learn.

Edit

Here's a version that doesn't use any stack, instead letting its queue get larger and larger (using up heap).

(defn heap-occurs [item coll]
  (loop [count 0, queue coll]
    (if-let [[x & xs] (seq queue)]
      (if (coll? x)
        (recur count (concat x xs))
        (recur (+ (if (= item x) 1, 0)
                  count)
               xs))
      count)))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文