Clojure 的 memoize 是否会强制对其参数进行求值?

发布于 2024-12-29 22:24:28 字数 246 浏览 1 评论 0原文

在 Clojure 中,如果我记住一个函数,请将其命名为 f 并在参数 a 上调用它。

如果 a 是一个很大的惰性值,memoize 是否会根据匹配 thunk 返回一个值,而不是强制评估 a 并匹配结果?

其中 thunk 是惰性序列中未评估的部分。

如果情况并非如此,是否有内置方法可以实现此行为?

谢谢!

In Clojure if I memoize a function, name it f and call it on an argument a.

If a is a large lazy value, does memoize return a value based on matching the thunk as opposed to forcing the evaluation of a and matching on the result?

Where a thunk is the unevaluated part of the lazy sequence.

If this isn't the case is there a built-in way to get this behaviour?

Thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

初与友歌 2025-01-05 22:24:28

正如 mikera 所说,memoize 不处理无限的惰性序列。我添加这个答案是为了提供对此的实现原因的简短描述(加上基于身份的记忆方案的两种想法,一种简单,一种更复杂)。 (编辑:实际上有一个简单的基于身份的通用解决方案,请参见下文。)

为什么它不起作用

memoize 使用哈希映射来存储从参数到值和哈希映射在确定对象是否是其键之一时使用 clojure.lang.Util/hasheq(除了仅返回 false 的空映射)。由于 hasheq 的惰性 seq 实现会强制执行整个 seq,因此询问任何映射是否无限惰性 seq 是其键之一将导致其进入无限的、内存耗尽的循环。因此,memoize 也是如此。

严格来说,最初的映射是一个数组映射。 (在 Clojure 中,出于效率考虑,小映射通常是数组映射;如果将足够的项关联到数组映射上,则返回值将变为哈希映射。)但是,非空数组映射由于涉及等价检查方法的类似原因,无法处理无限惰性序列。

解决方案

System/identityHashCode 会返回给定对象返回的任何 hashCode 如果它使用默认实现(无论其 hashCode 是否被覆盖) 。

(defprotocol PWrapped
  (-unwrap [this]))
PWrapped

(defn wrap [o]
  (reify
    Object
    (hashCode [_] (System/identityHashCode o))
    PWrapped
    (-unwrap [_] o)))

;;; adapted from clojure.core/memoize, (C) Rich Hickey, EPL-licenced
(defn memoize
  "Returns a memoized version of a referentially transparent function. The
  memoized version of the function keeps a cache of the mapping from arguments
  to results and, when calls with the same arguments are repeated often, has
  higher performance at the expense of higher memory use."
  {:added "1.0"
   :static true}
  [f]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem (map wrap args))]
        (val e)
        (let [ret (apply f args)]
          (swap! mem assoc (map wrap args) ret)
          ret)))))

现在您可以执行此操作(这不适用于常规 memoize):

user> (def r (lazy-cat (range 10000) (prn :foo) (range)))
#'user/r
user> (def f (memoize #(apply + (take 100 %))))
#'user/f
user> (f [1 2 3])
6
user> (f r)
4950

替代实现的原始讨论如下。

我认为没有内置解决方案使用指针相等。要实现像 memoize 这样通用的功能,您必须使用基于指针相等的散列(即 System/identityHashCode)来实现映射结构。或者,您可以使用 clojure.lang.PersistentQueue 构建一个简单的“最近使用的”缓存。

As stated by mikera, memoize doesn't handle infinite lazy seqs. I'm adding this answer to provide a short description of the implementation reasons for this (plus two ideas for identity-based memoization schemes, one simple, one more complex). (Edit: actually there is a simple general identity-based solution, see below.)

Why it doesn't work

memoize uses a hash map to store the mapping from arguments to values and hash maps use clojure.lang.Util/hasheq when determining if an object is one of their keys (except empty maps which just return false). Since hasheq's implementation for lazy seqs forces the entirety of the seq, asking any map if an infinite lazy seq is one of its keys will cause it to go into an infinite, memory-exhausting loop. Thus the same goes for memoize.

Strictly speaking, initially the map is an array map. (In Clojure, for reasons of efficiency, small maps are usually array maps; if enough items are assoc'd onto an array map, the return value becomes a hash map.) However non-empty array maps also fail to handle infinite lazy seqs due to a similar reason involving an equivalence-checking method.

A solution

System/identityHashCode returns whatever hashCode would return for a given objects if it used the default implementation (whether or not its hashCode is overridden).

(defprotocol PWrapped
  (-unwrap [this]))
PWrapped

(defn wrap [o]
  (reify
    Object
    (hashCode [_] (System/identityHashCode o))
    PWrapped
    (-unwrap [_] o)))

;;; adapted from clojure.core/memoize, (C) Rich Hickey, EPL-licenced
(defn memoize
  "Returns a memoized version of a referentially transparent function. The
  memoized version of the function keeps a cache of the mapping from arguments
  to results and, when calls with the same arguments are repeated often, has
  higher performance at the expense of higher memory use."
  {:added "1.0"
   :static true}
  [f]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem (map wrap args))]
        (val e)
        (let [ret (apply f args)]
          (swap! mem assoc (map wrap args) ret)
          ret)))))

Now you can do this (which wouldn't work with the regular memoize):

user> (def r (lazy-cat (range 10000) (prn :foo) (range)))
#'user/r
user> (def f (memoize #(apply + (take 100 %))))
#'user/f
user> (f [1 2 3])
6
user> (f r)
4950

Original discussion of alternative implementations follows.

I don't think there is a built-in solution using pointer equality. To implement one as general as memoize, you'd have to implement a map structure using pointer-equality-based hashing (viz. System/identityHashCode). Or you could build a simple "most recently used" cache with clojure.lang.PersistentQueue.

我不在是我 2025-01-05 22:24:28

Memoize 将根据您传递给它的任何参数的进行记忆。

因此,memoize 可以很好地使用惰性序列作为参数:因为它会查看序列的值,如果需要,它会强制执行计算。

然而,这确实意味着您不能使用无限惰性序列,因为使用 memoize 可以强制对它们进行评估,这显然不会有好结果......

Memoize will memoize on the basis of the value of whatever argument you pass to it.

Therefore memoize works fine with lazy sequences as arguments: because it looks at the value of the sequence it will force evaluation if needed.

This does however mean that you can't use infinite lazy sequences, since the use of memoize can force evaluation of them which will clearly not end well.....

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文