Clojure partition by filter

Posted 2024-11-01 11:22:39

In Scala, the partition method splits a sequence into two separate sequences -- those for which the predicate is true and those for which it is false:

scala> List(1, 5, 2, 4, 6, 3, 7, 9, 0, 8).partition(_ % 2 == 0)
res1: (List[Int], List[Int]) = (List(2, 4, 6, 0, 8),List(1, 5, 3, 7, 9))

Note that the Scala implementation only traverses the sequence once.

In Clojure, the partition-by function splits the sequence into multiple sub-sequences, each the longest run of consecutive elements that either all meet or all fail the predicate:

user=> (partition-by #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
((1 5) (2 4 6) (3 7 9) (0 8))

while split-with produces:

user=> (split-with #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
[() (1 5 2 4 6 3 7 9 0 8)]
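
The empty first sequence follows from how split-with is documented: it returns [(take-while pred coll) (drop-while pred coll)], so it only splits off the leading run of matching elements. A small sketch of that behavior on the same input (the name xs is only for illustration):

```clojure
;; xs is an illustrative name for the input used in the question.
(def xs [1 5 2 4 6 3 7 9 0 8])

;; The first element (1) is odd, so the take-while half is empty
;; and the drop-while half is the whole input.
(split-with even? xs)
;; => [() (1 5 2 4 6 3 7 9 0 8)]

;; With an input that starts with even numbers, only the leading run
;; of evens is split off; the later 6 stays in the second half.
(split-with even? [2 4 1 6 3])
;; => [(2 4) (1 6 3)]
```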

Is there a built-in Clojure function that does the same thing as the Scala partition method?

Comments (5)

剑心龙吟 2024-11-08 11:22:39

I believe the function you are looking for is clojure.core/group-by. It returns a map of keys to lists of items in the original sequence for which the grouping function returns that key. If you use a true/false producing predicate, you will get the split that you are looking for.

user=> (group-by even? [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
{false [1 5 3 7 9], true [2 4 6 0 8]}

If you take a look at the implementation, it fulfills your requirement that it only use one pass. Plus, it uses transients under the hood so it should be faster than the other solutions posted thus far. One caveat is that you should be sure of the keys that your grouping function is producing. If it produces nil instead of false, then your map will list failing items under the nil key. If your grouping function produces non-nil values instead of true, then you could have passing values listed under multiple keys. Not a big problem, just be aware that you need to use a true/false producing predicate for your grouping function.
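
To make the caveat concrete, here is a minimal sketch using a set as the grouping function. A set called as a function returns the matching element or nil, so it is truthy/falsy without being strictly true/false. The name wanted is illustrative only.

```clojure
;; wanted is an illustrative name, not something from the original answer.
(def wanted #{2 4 6})

;; The set returns the element itself (or nil), so each matching item
;; lands under its own key and the failing items land under nil.
(group-by wanted [1 2 3 4])
;; => {nil [1 3], 2 [2], 4 [4]}

;; Coercing the result with boolean restores the two-key shape.
(group-by (comp boolean wanted) [1 2 3 4])
;; => {false [1 3], true [2 4]}
```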

The nice thing about group-by is that it is more general than just splitting a sequence into passing and failing items. You can easily use this function to group your sequence into as many categories as you need. Very useful and flexible. That is probably why group-by is in clojure.core instead of separate.
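
As a sketch of both points, the first form below groups into three categories, and the second wraps group-by to return a Scala-style pair. The helper name partition-pair is hypothetical and not part of clojure.core.

```clojure
;; Group 0..9 by remainder modulo 3: three categories instead of two.
(group-by #(rem % 3) (range 10))
;; => {0 [0 3 6 9], 1 [1 4 7], 2 [2 5 8]}

;; Hypothetical helper: returns [matching non-matching].
;; Both keys are looked up with a default, because a key is absent
;; from the map when no item maps to it.
(defn partition-pair
  [pred coll]
  (let [groups (group-by (comp boolean pred) coll)]
    [(get groups true []) (get groups false [])]))

(partition-pair even? [1 5 2 4 6 3 7 9 0 8])
;; => [[2 4 6 0 8] [1 5 3 7 9]]
```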

孤蝉 2024-11-08 11:22:39

Part of clojure.contrib.seq-utils:

user> (use '[clojure.contrib.seq-utils :only [separate]])
nil
user> (separate even? [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
[(2 4 6 0 8) (1 5 3 7 9)]

背叛残局 2024-11-08 11:22:39

Please note that the answers of Jürgen, Adrian and Mikera all traverse the input sequence twice.

(defn single-pass-separate
  [pred coll]
  (reduce (fn [[yes no] item]
            (if (pred item)
              [(conj yes item) no]
              [yes (conj no item)]))
          [[] []]
          coll))

A single pass can only be eager. A lazy version has to make two passes and also ends up weakly holding onto the head of the sequence.
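
One way to check the single-pass claim is to count how often the predicate runs. The sketch below repeats single-pass-separate so it runs on its own; calls and counting-even? are illustrative names.

```clojure
;; Copied from the answer above so the snippet is self-contained.
(defn single-pass-separate
  [pred coll]
  (reduce (fn [[yes no] item]
            (if (pred item)
              [(conj yes item) no]
              [yes (conj no item)]))
          [[] []]
          coll))

;; Illustrative counter and predicate wrapper.
(def calls (atom 0))

(defn counting-even?
  [x]
  (swap! calls inc)
  (even? x))

(single-pass-separate counting-even? [1 5 2 4 6 3 7 9 0 8])
;; => [[2 4 6 0 8] [1 5 3 7 9]]

;; The predicate ran once per element.
@calls
;; => 10
```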

Edit: lazy-single-pass-separate is possible but hard to understand. In fact, I believe it is slower than a simple second pass, but I haven't checked that.

(defn lazy-single-pass-separate
  [pred coll]
  (let [coll       (atom coll)
        yes        (atom clojure.lang.PersistentQueue/EMPTY)
        no         (atom clojure.lang.PersistentQueue/EMPTY)
        fill-queue (fn [q]
                     (while (zero? (count @q))
                       (locking coll
                         (when (zero? (count @q))
                           (when-let [s (seq @coll)]
                             (let [fst (first s)]
                               (if (pred fst)
                                 (swap! yes conj fst)
                                 (swap! no conj fst))
                               (swap! coll rest)))))))
        queue      (fn queue [q]
                     (lazy-seq
                       (fill-queue q)
                       (when (pos? (count @q))
                         (let [item (peek @q)]
                           (swap! q pop)
                           (cons item (queue q))))))]
    [(queue yes) (queue no)]))

This is as lazy as you can get:

user=> (let [[y n] (lazy-single-pass-separate even? (report-seq))] (def yes y) (def no n))
#'user/no
user=> (first yes)
">0<"
0
user=> (second no)
">1<"
">2<"
">3<"
3
user=> (second yes)
2

Looking at the above, I'd say "go eager" or "go two pass."
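
The REPL session above calls report-seq, which is not defined anywhere in the answer. A hypothetical reconstruction that would be consistent with the printed output is an unchunked lazy sequence of 0, 1, 2, ... that prints each number as it is realized. This is an assumption, not the author's code.

```clojure
;; Hypothetical reconstruction: report-seq is not defined in the answer.
;; It yields 0, 1, 2, ... and prints ">n<" (via prn, hence the quotes)
;; at the moment element n is realized, one element at a time.
(defn report-seq
  ([] (report-seq 0))
  ([n]
   (lazy-seq
     (prn (str ">" n "<"))
     (cons n (report-seq (inc n))))))

;; Taking two elements realizes exactly two, printing ">0<" and ">1<".
(doall (take 2 (report-seq)))
;; => (0 1)
```

Each printed line in the session then marks one element being pulled from the input, which is how the trace shows that asking for (second no) had to consume 1, 2 and 3.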

另类 2024-11-08 11:22:39

It's not hard to write something that does the trick:

(defn partition-2 [pred coll]
  ((juxt 
    (partial filter pred) 
    (partial filter (complement pred))) 
  coll))

(partition-2 even? (range 10))

=> [(0 2 4 6 8) (1 3 5 7 9)]
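
Since juxt passes the same arguments to every function it combines, the same result can be sketched more briefly with filter and remove from clojure.core. The counter below (filter-calls, an illustrative name) shows that this approach runs the predicate twice per element.

```clojure
;; Equivalent to [(filter pred coll) (remove pred coll)].
((juxt filter remove) even? (range 10))
;; => [(0 2 4 6 8) (1 3 5 7 9)]

;; Illustrative counter: how often does the predicate run?
(def filter-calls (atom 0))

(let [pred          #(do (swap! filter-calls inc) (even? %))
      [evens odds]  ((juxt filter remove) pred (range 10))]
  ;; Realize both lazy sequences completely.
  [(doall evens) (doall odds)])
;; => [(0 2 4 6 8) (1 3 5 7 9)]

;; Ten elements, two passes.
@filter-calls
;; => 20
```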
巨坚强 2024-11-08 11:22:39

Maybe see https://github.com/amalloy/clojure-useful/blob/master/src/useful.clj#L50 - whether it traverses the sequence twice depends on what you mean by "traverse the sequence".

Edit: Now that I'm not on my phone, I guess it's silly to link instead of paste:

(defn separate
  [pred coll]
  (let [coll (map (fn [x]
                    [x (pred x)])
                  coll)]
    (vec (map #(map first (% second coll))
              [filter remove]))))
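
A sketch of what "depends on what you mean" looks like in practice: the lazy sequence of [x (pred x)] pairs is walked by both filter and remove, but because it is the same cached sequence, the predicate itself runs only once per element. The function is repeated here so the snippet runs on its own; pred-calls is an illustrative name.

```clojure
;; Copied from the answer above so the snippet is self-contained.
(defn separate
  [pred coll]
  (let [coll (map (fn [x]
                    [x (pred x)])
                  coll)]
    (vec (map #(map first (% second coll))
              [filter remove]))))

;; Illustrative counter for predicate invocations.
(def pred-calls (atom 0))

(let [[evens odds] (separate #(do (swap! pred-calls inc) (even? %))
                             (range 10))]
  ;; Realize both lazy results completely.
  [(doall evens) (doall odds)])
;; => [(0 2 4 6 8) (1 3 5 7 9)]

;; The pairs are traversed twice, but the predicate ran once per element.
@pred-calls
;; => 10
```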