如何编写通用的 memoize 函数?

发布于 2024-07-06 14:15:21 字数 396 浏览 13 评论 0原文

我正在编写一个函数来查找 三角形数字 以及自然的方法递归地写:

function triangle (x)
   if x == 0 then return 0 end
   return x+triangle(x-1)
end

但是尝试计算前 100,000 个三角形数会失败,并在一段时间后出现堆栈溢出。 这是 memoize 的理想函数,但我想要一个能够记住我传递给它的任何函数的解决方案。

I'm writing a function to find triangle numbers and the natural way to write it is recursively:

function triangle (x)
   if x == 0 then return 0 end
   return x+triangle(x-1)
end

But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(15

恏ㄋ傷疤忘ㄋ疼 2024-07-13 14:15:21

Mathematica 有一种特别巧妙的记忆方法,它依赖于散列和函数调用使用相同语法的事实:

triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]

就是这样。 它之所以有效,是因为模式匹配函数调用的规则是这样的:它总是在更通用的定义之前使用更具体的定义。

当然,正如已经指出的,这个例子有一个封闭形式的解决方案:triangle[x_] := x*(x+1)/2。 斐波那契数是添加记忆如何带来巨大加速的经典示例:

fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]

尽管它也有一个封闭形式的等价物,尽管更混乱: http://mathworld.wolfram.com/FibonacciNumber.html

我不同意有人认为这不适合记忆,因为你可以“只使用循环”。 记忆的要点是任何重复函数调用都是 O(1) 时间。 这比 O(n) 好很多。 事实上,您甚至可以编造一个场景,其中记忆实现比封闭形式实现具有更好的性能!

Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:

triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]

That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.

Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:

fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]

Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html

I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!

烟燃烟灭 2024-07-13 14:15:21

您还为原来的问题提出了错误的问题;)

对于这种情况,这是一个更好的方法:

triangle(n) = n * (n - 1) / 2

此外,假设公式没有如此简洁的解决方案,记忆在这里仍然是一个糟糕的方法。 在这种情况下,您最好只编写一个简单的循环。 请参阅此答案进行更全面的讨论。

You're also asking the wrong question for your original problem ;)

This is a better way for that case:

triangle(n) = n * (n - 1) / 2

Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.

小霸王臭丫头 2024-07-13 14:15:21

我打赌这样的东西应该适用于 Lua 中的变量参数列表:

local function varg_tostring(...)
    local s = select(1, ...)
    for n = 2, select('#', ...) do
        s = s..","..select(n,...)
    end
    return s
end

local function memoize(f)
    local cache = {}
    return function (...)
        local al = varg_tostring(...)
        if cache[al] then
            return cache[al]
        else
            local y = f(...)
            cache[al] = y
            return y
        end
    end
end

您可能还可以使用 __tostring 对元表进行一些巧妙的处理,以便可以使用 tostring() 来转换参数列表。 哦,可能性。

I bet something like this should work with variable argument lists in Lua:

local function varg_tostring(...)
    local s = select(1, ...)
    for n = 2, select('#', ...) do
        s = s..","..select(n,...)
    end
    return s
end

local function memoize(f)
    local cache = {}
    return function (...)
        local al = varg_tostring(...)
        if cache[al] then
            return cache[al]
        else
            local y = f(...)
            cache[al] = y
            return y
        end
    end
end

You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.

剩余の解释 2024-07-13 14:15:21

在 C# 3.0 中 - 对于递归函数,您可以执行以下操作:

public static class Helpers
{
    public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>,  R> f)
    {
        var map = new Dictionary<A, R>();
        Func<A, R> self = null;
        self = (a) =>
        {
            R value;
            if (map.TryGetValue(a, out value))
                return value;
            value = f(a, self);
            map.Add(a, value);
            return value;
        };
        return self;
    }        
}

然后您可以创建一个记忆斐波那契函数,如下所示:

var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));

In C# 3.0 - for recursive functions, you can do something like:

public static class Helpers
{
    public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>,  R> f)
    {
        var map = new Dictionary<A, R>();
        Func<A, R> self = null;
        self = (a) =>
        {
            R value;
            if (map.TryGetValue(a, out value))
                return value;
            value = f(a, self);
            map.Add(a, value);
            return value;
        };
        return self;
    }        
}

Then you can create a memoized Fibonacci function like this:

var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
节枝 2024-07-13 14:15:21

在 Scala 中(未经测试):

def memoize[A, B](f: (A)=>B) = {
  var cache = Map[A, B]()

  { x: A =>
    if (cache contains x) cache(x) else {
      val back = f(x)
      cache += (x -> back)

      back
    }
  }
}

请注意,这仅适用于 arity 1 的函数,但通过柯里化,您可以使其工作。 更微妙的问题是对于任何函数 f 来说 memoize(f) != memoize(f) 。 解决这个问题的一种非常偷偷摸摸的方法如下:

val correctMem = memoize(memoize _)

我认为这不会编译,但它确实说明了这个想法。

In Scala (untested):

def memoize[A, B](f: (A)=>B) = {
  var cache = Map[A, B]()

  { x: A =>
    if (cache contains x) cache(x) else {
      val back = f(x)
      cache += (x -> back)

      back
    }
  }
}

Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:

val correctMem = memoize(memoize _)

I don't think that this will compile, but it does illustrate the idea.

策马西风 2024-07-13 14:15:21

更新:评论者指出记忆化是优化递归的好方法。 诚然,我以前没有考虑过这一点,因为我通常使用一种语言 (C#),在这种语言中,构建通用记忆化并不是那么简单。 请对下面的帖子持保留态度。

我认为 Luke 可能有最合适的解决方案这个问题,但记忆通常不能解决任何堆栈溢出问题。

堆栈溢出通常是由于递归深度超过平台可以处理的深度而引起的。 语言有时支持“尾递归”,它重新使用当前调用的上下文,而不是而不是为递归调用创建新的上下文。 但很多主流语言/平台不支持这一点。 例如,C# 本身就没有对尾递归的支持。 .NET JITter 的 64 位版本可以将其应用为 IL 级别的优化,如果您需要支持 32 位平台,这几乎毫无用处。

如果您的语言不支持尾递归,那么避免堆栈溢出的最佳选择是转换为显式循环(不太优雅,但有时​​是必要的),或者找到一种非迭代算法,例如 Luke 为这个问题提供的算法。

Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.

I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.

Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.

If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.

少女七分熟 2024-07-13 14:15:21
function memoize (f)
   local cache = {}
   return function (x)
             if cache[x] then
                return cache[x]
             else
                local y = f(x)
                cache[x] = y
                return y
             end
          end
end

triangle = memoize(triangle);

请注意,为了避免堆栈溢出,三角形仍然需要播种。

function memoize (f)
   local cache = {}
   return function (x)
             if cache[x] then
                return cache[x]
             else
                local y = f(x)
                cache[x] = y
                return y
             end
          end
end

triangle = memoize(triangle);

Note that to avoid a stack overflow, triangle would still need to be seeded.

葮薆情 2024-07-13 14:15:21

这是无需将参数转换为字符串即可工作的方法。
唯一需要注意的是它不能处理 nil 参数。 但接受的解决方案无法区分值 nil 和字符串 "nil",所以这可能没问题。

local function m(f)
  local t = { }
  local function mf(x, ...) -- memoized f
    assert(x ~= nil, 'nil passed to memoized function')
    if select('#', ...) > 0 then
      t[x] = t[x] or m(function(...) return f(x, ...) end)
      return t[x](...)
    else
      t[x] = t[x] or f(x)
      assert(t[x] ~= nil, 'memoized function returns nil')
      return t[x]
    end
  end
  return mf
end

Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.

local function m(f)
  local t = { }
  local function mf(x, ...) -- memoized f
    assert(x ~= nil, 'nil passed to memoized function')
    if select('#', ...) > 0 then
      t[x] = t[x] or m(function(...) return f(x, ...) end)
      return t[x](...)
    else
      t[x] = t[x] or f(x)
      assert(t[x] ~= nil, 'memoized function returns nil')
      return t[x]
    end
  end
  return mf
end
╰ゝ天使的微笑 2024-07-13 14:15:21

我受到这个问题的启发,在 Lua 中实现(又一个)灵活的 memoize 函数。

https://github.com/kikito/memoize.lua

主要优点:

  • 接受可变数量的参数
  • 不使用 tostring; 相反,它以树形结构组织缓存,并使用参数来遍历它。
  • 与返回多个值的函数配合得很好。

将代码粘贴到此处作为参考:

local globalCache = {}

local function getFromCache(cache, args)
  local node = cache
  for i=1, #args do
    if not node.children then return {} end
    node = node.children[args[i]]
    if not node then return {} end
  end
  return node.results
end

local function insertInCache(cache, args, results)
  local arg
  local node = cache
  for i=1, #args do
    arg = args[i]
    node.children = node.children or {}
    node.children[arg] = node.children[arg] or {}
    node = node.children[arg]
  end
  node.results = results
end


-- public function

local function memoize(f)
  globalCache[f] = { results = {} }
  return function (...)
    local results = getFromCache( globalCache[f], {...} )

    if #results == 0 then
      results = { f(...) }
      insertInCache(globalCache[f], {...}, results)
    end

    return unpack(results)
  end
end

return memoize

I've been inspired by this question to implement (yet another) flexible memoize function in Lua.

https://github.com/kikito/memoize.lua

Main advantages:

  • Accepts a variable number of arguments
  • Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
  • Works just fine with functions that return multiple values.

Pasting the code here as reference:

local globalCache = {}

local function getFromCache(cache, args)
  local node = cache
  for i=1, #args do
    if not node.children then return {} end
    node = node.children[args[i]]
    if not node then return {} end
  end
  return node.results
end

local function insertInCache(cache, args, results)
  local arg
  local node = cache
  for i=1, #args do
    arg = args[i]
    node.children = node.children or {}
    node.children[arg] = node.children[arg] or {}
    node = node.children[arg]
  end
  node.results = results
end


-- public function

local function memoize(f)
  globalCache[f] = { results = {} }
  return function (...)
    local results = getFromCache( globalCache[f], {...} )

    if #results == 0 then
      results = { f(...) }
      insertInCache(globalCache[f], {...}, results)
    end

    return unpack(results)
  end
end

return memoize
╰◇生如夏花灿烂 2024-07-13 14:15:21

这是一个通用的 C# 3.0 实现,如果它可以帮助的话:(

public static class Memoization
{
    public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
    {
        var cache = new Dictionary<T, TResult>();
        var nullCache = default(TResult);
        var isNullCacheSet = false;
        return  parameter =>
                {
                    TResult value;

                    if (parameter == null && isNullCacheSet)
                    {
                        return nullCache;
                    }

                    if (parameter == null)
                    {
                        nullCache = function(parameter);
                        isNullCacheSet = true;
                        return nullCache;
                    }

                    if (cache.TryGetValue(parameter, out value))
                    {
                        return value;
                    }

                    value = function(parameter);
                    cache.Add(parameter, value);
                    return value;
                };
    }
}

引用自 法语博客文章)

Here is a generic C# 3.0 implementation, if it could help :

public static class Memoization
{
    public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
    {
        var cache = new Dictionary<T, TResult>();
        var nullCache = default(TResult);
        var isNullCacheSet = false;
        return  parameter =>
                {
                    TResult value;

                    if (parameter == null && isNullCacheSet)
                    {
                        return nullCache;
                    }

                    if (parameter == null)
                    {
                        nullCache = function(parameter);
                        isNullCacheSet = true;
                        return nullCache;
                    }

                    if (cache.TryGetValue(parameter, out value))
                    {
                        return value;
                    }

                    value = function(parameter);
                    cache.Add(parameter, value);
                    return value;
                };
    }
}

(Quoted from a french blog article)

夜血缘 2024-07-13 14:15:21

本着用不同语言发布记忆的方式,我想用一个不改变语言的 C++ 示例来回复 @onebyone.livejournal.com。

首先,用于单个参数函数的记忆器:

template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
    template <class F>
    const Result& operator()(F f, const Arg& a){
        typename ResultStore::const_iterator it = memo_.find(a);
        if(it == memo_.end()) {
            it = memo_.insert(make_pair(a, f(a))).first;
        }
        return it->second;
    }
private:
    ResultStore memo_;
};

只需创建记忆器的实例,为其提供函数和参数。 只需确保不要在两个不同的函数之间共享相同的备忘录(但您可以在同一函数的不同实现之间共享它)。

接下来是驱动程序函数和实现。 只有驱动程序功能需要公开
int fib(int); // 司机
int fib_(int); // 实现

已实现:

int fib_(int n){
    ++total_ops;
    if(n == 0 || n == 1) 
        return 1;
    else
        return fib(n-1) + fib(n-2);
}

以及驱动程序,用于

int fib(int n) {
    static memoizer1<int,int> memo;
    return memo(fib_, n);
}

在 codepad.org 上记住 显示输出的永久链接。 测量调用次数以验证正确性。 (在此处插入单元测试...)

这仅记住一个输入函数。 对多个参数或不同参数的概括留给读者作为练习。

In the vein of posting memoization in different languages, i'd like to respond to @onebyone.livejournal.com with a non-language-changing C++ example.

First, a memoizer for single arg functions:

template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
    template <class F>
    const Result& operator()(F f, const Arg& a){
        typename ResultStore::const_iterator it = memo_.find(a);
        if(it == memo_.end()) {
            it = memo_.insert(make_pair(a, f(a))).first;
        }
        return it->second;
    }
private:
    ResultStore memo_;
};

Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).

Next, a driver functon, and an implementation. only the driver function need be public
int fib(int); // driver
int fib_(int); // implementation

Implemented:

int fib_(int n){
    ++total_ops;
    if(n == 0 || n == 1) 
        return 1;
    else
        return fib(n-1) + fib(n-2);
}

And the driver, to memoize

int fib(int n) {
    static memoizer1<int,int> memo;
    return memo(fib_, n);
}

Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)

This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.

蹲墙角沉默 2024-07-13 14:15:21

在 Perl 中,通用记忆很容易实现。 Memoize 模块是 Perl 核心的一部分,高度可靠、灵活且易于使用。

其手册页中的示例:

# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments);    # Is faster than it was before

您可以在运行时添加、删除和自定义函数的记忆!您可以为自定义记忆计算提供回调。

Memoize.pm 甚至具有使备忘录缓存持久化的功能,因此不需要在每次调用程序时重新填充!

这是文档:http://perldoc.perl.org/5.8.8/Memoize。 html

In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.

The example from it's manpage:

# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments);    # Is faster than it was before

You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.

Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!

Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html

一指流沙 2024-07-13 14:15:21

扩展这个想法,还可以使用两个输入参数来记忆函数:

function memoize2 (f)
   local cache = {}
   return function (x, y)
             if cache[x..','..y] then
                return cache[x..','..y]
             else
                local z = f(x,y)
                cache[x..','..y] = z
                return z
             end
          end
end

请注意,参数顺序在缓存算法中很重要,因此,如果参数顺序在要记忆的函数中不重要,则获得缓存命中的几率将增加在检查缓存之前对参数进行排序。

但重要的是要注意,某些功能无法通过记忆来获利。 我写了 memoize2 来看看递归欧几里得算法是否可以找到最大公约数可以加快。

function gcd (a, b) 
   if b == 0 then return a end
   return gcd(b, a%b)
end

事实证明,gcd 对记忆的反应不佳。 它所做的计算比缓存算法要便宜得多。 对于大量的数据,它很快就会终止。 一段时间后,缓存变得非常大。 该算法可能是尽可能快的。

Extending the idea, it's also possible to memoize functions with two input parameters:

function memoize2 (f)
   local cache = {}
   return function (x, y)
             if cache[x..','..y] then
                return cache[x..','..y]
             else
                local z = f(x,y)
                cache[x..','..y] = z
                return z
             end
          end
end

Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.

But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.

function gcd (a, b) 
   if b == 0 then return a end
   return gcd(b, a%b)
end

As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.

蓬勃野心 2024-07-13 14:15:21

递归不是必需的。 第 n 个三角形数是 n(n-1)/2,所以...

public int triangle(final int n){
   return n * (n - 1) / 2;
}

Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...

public int triangle(final int n){
   return n * (n - 1) / 2;
}
吐个泡泡 2024-07-13 14:15:21

请不要重复这个。 可以使用 x*(x+1)/2 公式,也可以简单地迭代这些值并随时记住。

int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
  sum+=i;
  memo[i] = sum;
}
return memo[n];

Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.

int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
  sum+=i;
  memo[i] = sum;
}
return memo[n];
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文