当前位置：文江博客话题详情

在 Scala 中使用什么类型来存储内存中的可变数据表？

发布于 2024-09-18 05:11:13 字数 145 浏览 13 评论 0原文

每次调用函数时，如果给定参数值集的结果尚未记忆，我想将结果放入内存表中。一列用于存储结果，其他列用于存储参数值。

我如何最好地实现这一点？参数有多种类型，包括一些枚举。

在 C# 中，我通常使用 DataTable。 Scala 中有等效的吗？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

一个人练习一个人 2024-09-25 05:11:13

您可以使用 mutable.Map[TupleN[A1, A2, ..., AN], R] ~~，或者如果内存是一个问题，则使用 WeakHashMap~~[1] 。下面的定义（基于 michid 博客中的记忆代码构建）允许您轻松记忆具有多个参数的函数。例如：

import Memoize._

def reallySlowFn(i: Int, s: String): Int = {
   Thread.sleep(3000)
   i + s.length
}

val memoizedSlowFn = memoize(reallySlowFn _)
memoizedSlowFn(1, "abc") // returns 4 after about 3 seconds
memoizedSlowFn(1, "abc") // returns 4 almost instantly

定义：

/**
 * A memoized unary function.
 *
 * @param f A unary function to memoize
 * @param [T] the argument type
 * @param [R] the return type
 */
class Memoize1[-T, +R](f: T => R) extends (T => R) {
   import scala.collection.mutable
   // map that stores (argument, result) pairs
   private[this] val vals = mutable.Map.empty[T, R]

   // Given an argument x, 
   //   If vals contains x return vals(x).
   //   Otherwise, update vals so that vals(x) == f(x) and return f(x).
   def apply(x: T): R = vals getOrElseUpdate (x, f(x))
}

object Memoize {
   /**
    * Memoize a unary (single-argument) function.
    *
    * @param f the unary function to memoize
    */
   def memoize[T, R](f: T => R): (T => R) = new Memoize1(f)

   /**
    * Memoize a binary (two-argument) function.
    * 
    * @param f the binary function to memoize
    * 
    * This works by turning a function that takes two arguments of type
    * T1 and T2 into a function that takes a single argument of type 
    * (T1, T2), memoizing that "tupled" function, then "untupling" the
    * memoized function.
    */
   def memoize[T1, T2, R](f: (T1, T2) => R): ((T1, T2) => R) = 
      Function.untupled(memoize(f.tupled))

   /**
    * Memoize a ternary (three-argument) function.
    *
    * @param f the ternary function to memoize
    */
   def memoize[T1, T2, T3, R](f: (T1, T2, T3) => R): ((T1, T2, T3) => R) =
      Function.untupled(memoize(f.tupled))

   // ... more memoize methods for higher-arity functions ...

   /**
    * Fixed-point combinator (for memoizing recursive functions).
    */
   def Y[T, R](f: (T => R) => T => R): (T => R) = {
      lazy val yf: (T => R) = memoize(f(yf)(_))
      yf
   }
}

定点组合器 (Memoize.Y) 使得记忆递归函数成为可能：

val fib: BigInt => BigInt = {                         
   def fibRec(f: BigInt => BigInt)(n: BigInt): BigInt = {
      if (n == 0) 1 
      else if (n == 1) 1 
      else (f(n-1) + f(n-2))                           
   }                                                     
   Memoize.Y(fibRec)
}

[1] WeakHashMap 不能很好地用作缓存。请参阅http://www.codeinstructions.com/2008 /09/weakashmap-is-not-cache-understanding.html 和这个相关问题。

You could use a mutable.Map[TupleN[A1, A2, ..., AN], R] ~~, or if memory is a concern, a WeakHashMap~~[1]. The definitions below (built on the memoization code from michid's blog) allow you to easily memoize functions with multiple arguments. For example:

import Memoize._

def reallySlowFn(i: Int, s: String): Int = {
   Thread.sleep(3000)
   i + s.length
}

val memoizedSlowFn = memoize(reallySlowFn _)
memoizedSlowFn(1, "abc") // returns 4 after about 3 seconds
memoizedSlowFn(1, "abc") // returns 4 almost instantly

Definitions:

/**
 * A memoized unary function.
 *
 * @param f A unary function to memoize
 * @param [T] the argument type
 * @param [R] the return type
 */
class Memoize1[-T, +R](f: T => R) extends (T => R) {
   import scala.collection.mutable
   // map that stores (argument, result) pairs
   private[this] val vals = mutable.Map.empty[T, R]

   // Given an argument x, 
   //   If vals contains x return vals(x).
   //   Otherwise, update vals so that vals(x) == f(x) and return f(x).
   def apply(x: T): R = vals getOrElseUpdate (x, f(x))
}

object Memoize {
   /**
    * Memoize a unary (single-argument) function.
    *
    * @param f the unary function to memoize
    */
   def memoize[T, R](f: T => R): (T => R) = new Memoize1(f)

   /**
    * Memoize a binary (two-argument) function.
    * 
    * @param f the binary function to memoize
    * 
    * This works by turning a function that takes two arguments of type
    * T1 and T2 into a function that takes a single argument of type 
    * (T1, T2), memoizing that "tupled" function, then "untupling" the
    * memoized function.
    */
   def memoize[T1, T2, R](f: (T1, T2) => R): ((T1, T2) => R) = 
      Function.untupled(memoize(f.tupled))

   /**
    * Memoize a ternary (three-argument) function.
    *
    * @param f the ternary function to memoize
    */
   def memoize[T1, T2, T3, R](f: (T1, T2, T3) => R): ((T1, T2, T3) => R) =
      Function.untupled(memoize(f.tupled))

   // ... more memoize methods for higher-arity functions ...

   /**
    * Fixed-point combinator (for memoizing recursive functions).
    */
   def Y[T, R](f: (T => R) => T => R): (T => R) = {
      lazy val yf: (T => R) = memoize(f(yf)(_))
      yf
   }
}

The fixed-point combinator (Memoize.Y) makes it possible to memoize recursive functions:

val fib: BigInt => BigInt = {                         
   def fibRec(f: BigInt => BigInt)(n: BigInt): BigInt = {
      if (n == 0) 1 
      else if (n == 1) 1 
      else (f(n-1) + f(n-2))                           
   }                                                     
   Memoize.Y(fibRec)
}

[1] WeakHashMap does not work well as a cache. See http://www.codeinstructions.com/2008/09/weakhashmap-is-not-cache-understanding.html and this related question.

回复收藏 0 原文

山有枢 2024-09-25 05:11:13

anovstrup 建议的使用可变 Map 的版本与 C# 中的版本基本相同，因此易于使用。

但如果您愿意，也可以使用更实用的风格。它使用不可变的映射，充当一种累加器。使用元组（而不是示例中的 Int）作为键的工作方式与可变情况下完全相同。

def fib(n:Int) = fibM(n, Map(0->1, 1->1))._1

def fibM(n:Int, m:Map[Int,Int]):(Int,Map[Int,Int]) = m.get(n) match {
   case Some(f) => (f, m)
   case None => val (f_1,m1) = fibM(n-1,m)
                val (f_2,m2) = fibM(n-2,m1)
                val f = f_1+f_2
                (f, m2 + (n -> f))   
}

当然，这有点复杂，但却是一种有用的技术（请注意，上面的代码旨在清晰，而不是速度）。

The version suggested by anovstrup using a mutable Map is basically the same as in C#, and therefore easy to use.

But if you want you can also use a more functional style as well. It uses immutable maps, which act as a kind of accumalator. Having Tuples (instead of Int in the example) as keys works exactly as in the mutable case.

def fib(n:Int) = fibM(n, Map(0->1, 1->1))._1

def fibM(n:Int, m:Map[Int,Int]):(Int,Map[Int,Int]) = m.get(n) match {
   case Some(f) => (f, m)
   case None => val (f_1,m1) = fibM(n-1,m)
                val (f_2,m2) = fibM(n-2,m1)
                val f = f_1+f_2
                (f, m2 + (n -> f))   
}

Of course this is a little bit more complicated, but a useful technique to know (note that the code above aims for clarity, not for speed).

回复收藏 0 原文

黯然 2024-09-25 05:11:13

作为这个主题的新手，我无法完全理解给出的任何示例（但无论如何还是要感谢）。谨此，我会针对有人来到这里具有相同级别和相同问题的情况提出我自己的解决方案。我认为我的代码对于任何只拥有非常非常基础的 Scala 知识<的人来说都是非常清晰的/a>.



def MyFunction(dt : DateTime, param : Int) : Double
{
  val argsTuple = (dt, param)
  if(Memo.contains(argsTuple)) Memo(argsTuple) else Memoize(dt, param, MyRawFunction(dt, param))
}

def MyRawFunction(dt : DateTime, param : Int) : Double
{
  1.0 // A heavy calculation/querying here
}

def Memoize(dt : DateTime, param : Int, result : Double) : Double
{
  Memo += (dt, param) -> result
  result
}

val Memo = new  scala.collection.mutable.HashMap[(DateTime, Int), Double]

工作完美。如果我错过了什么，我将不胜感激。

Being a newbie in this subject, I could fully understand none of the examples given (but would like to thank anyway). Respectfully, I'd present my own solution for the case some one comes here having a same level and same problem. I think my code can be crystal clear for anybody having just the very-very basic Scala knowledge.



def MyFunction(dt : DateTime, param : Int) : Double
{
  val argsTuple = (dt, param)
  if(Memo.contains(argsTuple)) Memo(argsTuple) else Memoize(dt, param, MyRawFunction(dt, param))
}

def MyRawFunction(dt : DateTime, param : Int) : Double
{
  1.0 // A heavy calculation/querying here
}

def Memoize(dt : DateTime, param : Int, result : Double) : Double
{
  Memo += (dt, param) -> result
  result
}

val Memo = new  scala.collection.mutable.HashMap[(DateTime, Int), Double]

Works perfectly. I'd appreciate critique If I've missed something.

回复收藏 0 原文

街角卖回忆 2024-09-25 05:11:13

当使用可变映射进行记忆时，应记住这会导致典型的并发问题，例如在写入尚未完成时执行获取。然而，线程安全的记忆尝试建议这样做，即使不是没有价值，也没有什么价值。

以下线程安全代码创建一个记忆化的斐波那契函数，启动几个调用该函数的线程（名称从“a”到“d”）。尝试几次代码（在 REPL 中），可以轻松地看到 f(2) set 被打印多次。这意味着线程 A 已经启动了 f(2) 的计算，但线程 B 完全不知道并开始了自己的计算副本。这种无知在缓存的构建阶段非常普遍，因为所有线程都看不到已建立的子解决方案，并且会进入 else 子句。

object ScalaMemoizationMultithread {

  // do not use case class as there is a mutable member here
  class Memo[-T, +R](f: T => R) extends (T => R) {
    // don't even know what would happen if immutable.Map used in a multithreading context
    private[this] val cache = new java.util.concurrent.ConcurrentHashMap[T, R]
    def apply(x: T): R =
      // no synchronized needed as there is no removal during memoization
      if (cache containsKey x) {
        Console.println(Thread.currentThread().getName() + ": f(" + x + ") get")
        cache.get(x)
      } else {
        val res = f(x)
        Console.println(Thread.currentThread().getName() + ": f(" + x + ") set")
        cache.putIfAbsent(x, res) // atomic
        res
      }
  }

  object Memo {
    def apply[T, R](f: T => R): T => R = new Memo(f)

    def Y[T, R](F: (T => R) => T => R): T => R = {
      lazy val yf: T => R = Memo(F(yf)(_))
      yf
    }
  }

  val fibonacci: Int => BigInt = {
    def fiboF(f: Int => BigInt)(n: Int): BigInt = {
      if (n <= 0) 1
      else if (n == 1) 1
      else f(n - 1) + f(n - 2)
    }

    Memo.Y(fiboF)
  }

  def main(args: Array[String]) = {
    ('a' to 'd').foreach(ch =>
      new Thread(new Runnable() {
        def run() {
          import scala.util.Random
          val rand = new Random
          (1 to 2).foreach(_ => {
            Thread.currentThread().setName("Thread " + ch)
            fibonacci(5)
          })
        }
      }).start)
  }
}

When using mutable map for memoization, one shall keep in mind that this would cause typical concurrency problems, e.g. doing a get when a write has not completed yet. However, thread-safe attemp of memoization suggests to do so it's of little value if not none.

The following thread-safe code creates a memoized fibonacci function, initiates a couple of threads (named from 'a' through to 'd') that make calls to it. Try the code a couple of times (in REPL), one can easily see f(2) set gets printed more than once. This means a thread A has initiated the calculation of f(2) but Thread B has totally no idea of it and starts its own copy of calculation. Such ignorance is so pervasive at the constructing phase of the cache, because all threads see no sub solution established and would enter the else clause.

object ScalaMemoizationMultithread {

  // do not use case class as there is a mutable member here
  class Memo[-T, +R](f: T => R) extends (T => R) {
    // don't even know what would happen if immutable.Map used in a multithreading context
    private[this] val cache = new java.util.concurrent.ConcurrentHashMap[T, R]
    def apply(x: T): R =
      // no synchronized needed as there is no removal during memoization
      if (cache containsKey x) {
        Console.println(Thread.currentThread().getName() + ": f(" + x + ") get")
        cache.get(x)
      } else {
        val res = f(x)
        Console.println(Thread.currentThread().getName() + ": f(" + x + ") set")
        cache.putIfAbsent(x, res) // atomic
        res
      }
  }

  object Memo {
    def apply[T, R](f: T => R): T => R = new Memo(f)

    def Y[T, R](F: (T => R) => T => R): T => R = {
      lazy val yf: T => R = Memo(F(yf)(_))
      yf
    }
  }

  val fibonacci: Int => BigInt = {
    def fiboF(f: Int => BigInt)(n: Int): BigInt = {
      if (n <= 0) 1
      else if (n == 1) 1
      else f(n - 1) + f(n - 2)
    }

    Memo.Y(fiboF)
  }

  def main(args: Array[String]) = {
    ('a' to 'd').foreach(ch =>
      new Thread(new Runnable() {
        def run() {
          import scala.util.Random
          val rand = new Random
          (1 to 2).foreach(_ => {
            Thread.currentThread().setName("Thread " + ch)
            fibonacci(5)
          })
        }
      }).start)
  }
}

回复收藏 0 原文

全部不再 2024-09-25 05:11:13

除了Landei的回答之外，我还想建议在Scala中采用自下而上（非记忆化）的方式进行DP是可能的，其核心思想是使用foldLeft(s)。

计算斐波那契数的示例

  def fibo(n: Int) = (1 to n).foldLeft((0, 1)) {
    (acc, i) => (acc._2, acc._1 + acc._2)
  }._1

最长递增子序列的示例

def longestIncrSubseq[T](xs: List[T])(implicit ord: Ordering[T]) = {
  xs.foldLeft(List[(Int, List[T])]()) {
    (memo, x) =>
      if (memo.isEmpty) List((1, List(x)))
      else {
        val resultIfEndsAtCurr = (memo, xs).zipped map {
          (tp, y) =>
            val len = tp._1
            val seq = tp._2
            if (ord.lteq(y, x)) { // current is greater than the previous end
              (len + 1, x :: seq) // reversely recorded to avoid O(n)
            } else {
              (1, List(x)) // start over
            }
        }
        memo :+ resultIfEndsAtCurr.maxBy(_._1)
      }
  }.maxBy(_._1)._2.reverse
}

In addition to Landei's answer, I also want to suggest the bottom-up (non-memoization) way of doing DP in Scala is possible, and the core idea is to use foldLeft(s).

Example for computing Fibonacci numbers

  def fibo(n: Int) = (1 to n).foldLeft((0, 1)) {
    (acc, i) => (acc._2, acc._1 + acc._2)
  }._1

Example for longest increasing subsequence

def longestIncrSubseq[T](xs: List[T])(implicit ord: Ordering[T]) = {
  xs.foldLeft(List[(Int, List[T])]()) {
    (memo, x) =>
      if (memo.isEmpty) List((1, List(x)))
      else {
        val resultIfEndsAtCurr = (memo, xs).zipped map {
          (tp, y) =>
            val len = tp._1
            val seq = tp._2
            if (ord.lteq(y, x)) { // current is greater than the previous end
              (len + 1, x :: seq) // reversely recorded to avoid O(n)
            } else {
              (1, List(x)) // start over
            }
        }
        memo :+ resultIfEndsAtCurr.maxBy(_._1)
      }
  }.maxBy(_._1)._2.reverse
}

回复收藏 0 原文

~没有更多了~