如何避免 Clojure 对我想要短路的惰性序列的分块行为？

发布于 2024-09-13 05:39:35 字数 1094 浏览 13 评论 0原文

我有一个很长的、惰性的序列，我想减少并惰性地测试它。一旦两个连续元素彼此不 = （或其他谓词），我就想停止使用该列表，因为该列表的生成成本很高。是的，这听起来像 take-while，但请进一步阅读。

我想写一些像这样简单而优雅的东西（假装every？像reduce一样工作）：

(every? = (range 100000000))

但这不能懒惰地工作，所以它挂在无限的seqs上。我发现这几乎按照我想要的方式工作：

(apply = (range 100000000))

但是，我注意到序列分块导致创建和测试额外的、不必要的元素。至少，我认为以下代码中发生的情况是这样的：

;; Displays chunking behavior in groups of four on my system and prints 1 2 3 4
(apply = (map #(do (println %) %) (iterate inc 1)))

;; This prints 0 to 31
(apply = (map #(do (println %) %) (range)))

我找到了一种使用 take-while 和 count 来检查元素数量的解决方法采取了，不过这样比较麻烦。

我是否应该礼貌地向 Rich Hickey 建议他将 reduce 和 every 进行某种组合？ 正确地短路，还是我错过了一些已经存在的明显方法？

编辑：有两个好心人发布了避免在惰性序列上分块的解决方案，但是在执行 apply 时如何避免分块，这似乎是在四个分块组中消耗？

编辑 #2： 正如斯图尔特·塞拉 (Stuart Sierra) 所指出的和我独立发现的那样，这实际上并不是分块。这只是正常表现，所以我就结束这个问题并给他答案。对于那些感兴趣的人，我在一个单独的答案中包含了一个小函数来减少问题的部分。

原文

I have a long, lazy sequence that I want to reduce and test lazily. As soon as two sequential elements are not = (or some other predicate) to each other, I want to stop consuming the list, which is expensive to produce. Yes, this sounds like take-while, but read further.

I wanted to write something simple and elegant like this (pretending for a minute that every? works like reduce):

(every? = (range 100000000))

But that does not work lazily and so it hangs on infinite seqs. I discovered that this works almost as I wanted:

(apply = (range 100000000))

However, I noticed that sequence chunking was resulting in extra, unnecessary elements being created and tested. At least, this is what I think this is what happening in the following bit of code:

;; Displays chunking behavior in groups of four on my system and prints 1 2 3 4
(apply = (map #(do (println %) %) (iterate inc 1)))

;; This prints 0 to 31
(apply = (map #(do (println %) %) (range)))

I found a workaround using take-while, and count to check the number of elements taken, but that is rather cumbersome.

Should I politely suggest to Rich Hickey that he make some combination of reduce and every? short circuit properly, or am I missing some obvious way that already exists?

EDIT: Two kind people posted solutions for avoiding chunking on the lazy sequences, but how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four?

EDIT #2: As Stuart Sierra notes and I discovered independently, this isn't actually chunking. It's just apply acting normally, so I'll call this closed and give him the answer. I included a small function in a separate answer to do the reduce'ing part of the problem, for those who are interested.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

放飞的风筝 2024-09-20 05:39:35

更正两次：一种更简单的方法来对惰性序列进行分块：

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

第一个版本省略了(when ...，因此在输入序列结束后它返回了一个无限的nil seq。

第二个版本使用 first 而不是 seq 因此它停止在 nil

RE: 你的另一个问题，“我如何避免分块。进行应用，这似乎以四个为一组的块进行消耗”：

这是由于 = 的定义，当给定一系列参数时，它会强制使用前 4 个：

(defn =
  ;; ... other arities ...
  ([x y & more]
   (if (= x y)
     (if (next more)
       (recur y (first more) (next more))
       (= y (first more)))
     false)))

CORRECTED TWICE: A simpler way to un-chunk a lazy sequence:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

First version omitted (when ... so it returned an infinite seq of nil's after the input sequence ended.

Second version used first instead of seq so it stopped on nil.

RE: your other question, "how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four":

This is due to the definition of =, which, when given a sequence of arguments, forces the first 4:

(defn =
  ;; ... other arities ...
  ([x y & more]
   (if (= x y)
     (if (next more)
       (recur y (first more) (next more))
       (= y (first more)))
     false)))

回复收藏 0 原文

や三分注定 2024-09-20 05:39:35

查看 clojure.core 中 apply 的定义，可以清楚地了解为什么当 apply 与无限序列一起使用时，它会以四个为一组进行分块。 Reduce 也不会短路...所以我只能编写自己的解决方案：

(defn reducep
  "Like reduce, but for use with a predicate. Short-circuits on first false."
  ([p coll]
     (if-let [s (seq coll)]
       (reducep p (first s) (next s))
       (p)))
  ([p val coll]
     (if-let [s (seq coll)]
       (if-let [v (p val (first s))]
         (recur p (first s) (next s))
         false)
       true)))

然后使用 Stuart 的 uncunkk （带有额外的 and）

(defn unchunk [s]
  (lazy-seq
   (cons (first s)
         (and (next s)
              (unchunk (next s))))))

我得到：

(reducep = (map #(do (print %) %) (unchunk (range)))) ;; Prints 01, returns false
(reducep = (map #(do (print %) %) (repeat 20 1))) ;; returns true
(reducep = (map #(do (print %) %) (unchunk [0 0 2 4 5]))) ;; Returns false
(reducep = (map #(do (print %) %) (unchunk [2 2 2 2 2]))) ;; returns true

如果那样也适合你，修改一下。

编辑：Stuart 在编辑后对 uncunkk 的修改版本可能比之前这篇文章中的版本更好。

Looking in clojure.core at the definition of apply makes it obvious as to why it was chunking in groups of four when apply is used with an infinite sequence. Reduce doesn't short circuit, either...so I'm left writing my own solution:

(defn reducep
  "Like reduce, but for use with a predicate. Short-circuits on first false."
  ([p coll]
     (if-let [s (seq coll)]
       (reducep p (first s) (next s))
       (p)))
  ([p val coll]
     (if-let [s (seq coll)]
       (if-let [v (p val (first s))]
         (recur p (first s) (next s))
         false)
       true)))

Then using Stuart's unchunk (with an extra and)

(defn unchunk [s]
  (lazy-seq
   (cons (first s)
         (and (next s)
              (unchunk (next s))))))

I get:

(reducep = (map #(do (print %) %) (unchunk (range)))) ;; Prints 01, returns false
(reducep = (map #(do (print %) %) (repeat 20 1))) ;; returns true
(reducep = (map #(do (print %) %) (unchunk [0 0 2 4 5]))) ;; Returns false
(reducep = (map #(do (print %) %) (unchunk [2 2 2 2 2]))) ;; returns true

If that works for you too, mod this up.

EDIT: Stuart's modified version of unchunk after his edit is probably preferable to the one in this earlier post.

回复收藏 0 原文

自由如风 2024-09-20 05:39:35

我在 4clojure 问题达到时间限制时发现了这篇文章
我找到了另一种方法来避免 32-chunks：

;; add another dummy sequence parameter to the map:
(apply = (map #(do (prn %2) %) (range) (range)))

map 的更高数量形式似乎不使用分块序列（clojure 1.5）

你必须对第二个参数做一些事情，所以
明确说明这一点可能会更好：

(apply = (map (fn [i _] (prn i) i) (range) (range)))

这不像其他解决方案那么简洁，但可能适合快速和肮脏的解决方案
使用，例如测试“由于分块而变慢吗？”。

关于apply，您可以使用partition从序列中获取对并=它们：

(every? #(apply = %) (partition 2 1
    (map (fn [i _] (prn i) i) (range) (range))))

尽管reducep看起来也很有用。

附言。我不想给人留下分块序列速度较慢的印象，事实并非如此。
我的问题是 4clojure 测试用例在我的 seq 生成函数上调用“first”
一系列值，所以分块意味着我做了 32 倍的工作。
（PPS。我的代码还是太慢）

I found this post while hitting a time limit on a 4clojure problem
and I found another way to avoid the 32-chunks:

;; add another dummy sequence parameter to the map:
(apply = (map #(do (prn %2) %) (range) (range)))

The higher arity forms of map don't appear to use chunked sequences (clojure 1.5)

You have to do something with the second parameter, so being
explicit about that might be better:

(apply = (map (fn [i _] (prn i) i) (range) (range)))

This isn't as neat as the other solutions but might be good for quick and dirty
uses, like testing "is this slow because of chunking?".

Regarding apply, you could use partition to get pairs from the sequence and = them:

(every? #(apply = %) (partition 2 1
    (map (fn [i _] (prn i) i) (range) (range))))

Although reducep looks useful too.

PS. I don't want to give the impression that the chunked sequence is slower, it's not.
My issue is that the 4clojure test case calls "first" on my seq generating function over
a range of values, so chunking means I do 32 times the work.
(PPS. My code is still too slow)

回复收藏 0 原文

~没有更多了~