Scala: How to merge a collection of maps

I have a List of Map[String, Double], and I'd like to merge their contents into a single Map[String, Double]. How should I do this in an idiomatic way? I imagine that I should be able to do this with a fold. Something like:

val newMap = (Map[String, Double]() /: listOfMaps) { (accumulator, m) => ... }

Furthermore, I'd like to handle key collisions in a generic way. That is, if I add a key to the map that already exists, I should be able to specify a function that returns a Double (in this case) and takes the existing value for that key, plus the value I'm trying to add. If the key does not yet exist in the map, then just add it and its value unaltered.

In my specific case I'd like to build a single Map[String, Double] such that if the map already contains a key, then the Double will be added to the existing map value.
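
For example, with addition as the collision function, the desired behaviour would be:

val listOfMaps = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
// desired result: Map("hello" -> 5.5, "world" -> 2.2, "goodbye" -> 3.3)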

I'm working with mutable maps in my specific code, but I'm interested in more generic solutions, if possible.

你又不是我 2024-08-06 19:56:05


Well, you could do:

mapList reduce (_ ++ _)

except for the special requirement for collision.
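
On a key collision, ++ simply keeps the right-hand map's value, which is why it doesn't meet that requirement on its own:

Map("hello" -> 1.1) ++ Map("hello" -> 4.4)  // Map(hello -> 4.4)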

Since you do have that special requirement, perhaps the best would be doing something like this (2.8):

def combine(m1: Map[Symbol, Double], m2: Map[Symbol, Double]): Map[Symbol, Double] = {
  val k1 = Set(m1.keysIterator.toList: _*)
  val k2 = Set(m2.keysIterator.toList: _*)
  val intersection = k1 & k2

  val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
  val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_)) 
  r2 ++ r1
}

You can then add this method to the map class through the Pimp My Library pattern, and use it in the original example instead of "++":

class CombiningMap(m1: Map[Symbol, Double]) {
  def combine(m2: Map[Symbol, Double]) = {
    val k1 = Set(m1.keysIterator.toList: _*)
    val k2 = Set(m2.keysIterator.toList: _*)
    val intersection = k1 & k2
    val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
    val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
    r2 ++ r1
  }
}

// Then use this:
implicit def toCombining(m: Map[Symbol, Double]) = new CombiningMap(m)

// And finish with:
mapList reduce (_ combine _)
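
// For example, with the implicit conversion in scope:
Map('a -> 1.0, 'b -> 2.0) combine Map('b -> 3.0, 'c -> 4.0)
// result contains 'a -> 1.0, 'c -> 4.0, and 'b -> 5.0 (the two 'b values added together)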

This was written for 2.8, so for 2.7 keysIterator becomes keys, filterKeys might need to be written in terms of filter and map, & becomes **, and so on; even so, it shouldn't be too different.

痴情 2024-08-06 19:56:05


How about this one:

def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] =
  (Map[A, B]() /: (for (m <- ms; kv <- m) yield kv)) { (a, kv) =>
    a + (if (a.contains(kv._1)) kv._1 -> f(a(kv._1), kv._2) else kv)
  }

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
val mm = mergeMap(ms)((v1, v2) => v1 + v2)

println(mm) // prints Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)

And it works in both 2.7.5 and 2.8.0.
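
Since f is just a (B, B) => B, the same mergeMap works with any collision function; for instance, keeping the larger value:

val mx = mergeMap(ms)((v1, v2) => math.max(v1, v2))
// Map(hello -> 4.4, world -> 2.2, goodbye -> 3.3)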

秋千易 2024-08-06 19:56:05


I'm surprised no one's come up with this solution yet:

myListOfMaps.flatten.toMap

Does exactly what you need:

  1. Merges the list to a single Map
  2. Weeds out any duplicate keys

Example:

scala> List(Map('a -> 1), Map('b -> 2), Map('c -> 3), Map('a -> 4, 'b -> 5)).flatten.toMap
res7: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 4, 'b -> 5, 'c -> 3)

flatten turns the list of maps into a flat list of tuples; toMap turns that list of tuples into a map, dropping duplicate keys (the last value seen for a key wins).

温柔一刀 2024-08-06 19:56:05


Starting with Scala 2.13, another solution which handles duplicate keys and relies only on the standard library consists in merging the Maps as sequences (flatten) before applying the new groupMapReduce operator, which (as its name suggests) is equivalent to a groupBy followed by a mapping and a reduce step over the grouped values:

List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
  .flatten
  .groupMapReduce(_._1)(_._2)(_ + _)
// Map("world" -> 2.2, "goodbye" -> 3.3, "hello" -> 5.5)

This:

  • flattens (concatenates) the maps as a sequence of tuples (List(("hello", 1.1), ("world", 2.2), ("goodbye", 3.3), ("hello", 4.4))), which keeps all key/values (even duplicate keys)

  • groups elements based on their first tuple part (_._1) (group part of groupMapReduce)

  • maps grouped values to their second tuple part (_._2) (map part of groupMapReduce)

  • reduces mapped grouped values (_+_) by taking their sum (but it can be any reduce: (T, T) => T function) (reduce part of groupMapReduce)


The groupMapReduce step can be seen as a one-pass equivalent of:

list.groupBy(_._1).mapValues(_.map(_._2).reduce(_ + _))
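
Note that on 2.13 itself, mapValues on a Map is deprecated, so this two-pass form would typically be spelled through a view:

list.groupBy(_._1).view.mapValues(_.map(_._2).reduce(_ + _)).toMap
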
琉璃繁缕 2024-08-06 19:56:05


Interesting, noodling around with this a bit, I got the following (on 2.7.5):

General Maps:

def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: Seq[scala.collection.Map[A,B]]): Map[A, B] = {
  listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
    // resolve collisions against the accumulator, then overlay the result on it
    m ++ Map(
      s.projection.map { pair =>
        if (m contains pair._1)
          (pair._1, collisionFunc(m(pair._1), pair._2))
        else
          pair
      }.force.toList: _*)
  }
}

But man, that is hideous with the projection and forcing and toList and whatnot. Separate question: what's a better way to deal with that within the fold?
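
A sketch of one way to avoid the projection/force/toList machinery is to fold each map's entries directly into the accumulator (the mergeMaps2 name is just for illustration):

def mergeMaps2[A, B](collisionFunc: (B, B) => B)(listOfMaps: Seq[Map[A, B]]): Map[A, B] =
  listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
    s.foldLeft(m) { case (acc, (k, v)) =>
      // combine with the existing value on a collision, otherwise just add the pair
      acc + (k -> acc.get(k).map(collisionFunc(_, v)).getOrElse(v))
    }
  }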

For mutable Maps, which is what I was dealing with in my code, and with a less general solution, I got this:

def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A, B] = {
    listOfMaps.foldLeft(mutable.Map[A,B]()) {
      (m, s) =>
      for (k <- s.keys) {
        if (m contains k)
          m(k) = collisionFunc(m(k), s(k))
        else
          m(k) = s(k)
      }
      m
    }
  }

That seems a little bit cleaner, but will only work with mutable Maps as it's written. Interestingly, I first tried the above (before I asked the question) using /: instead of foldLeft, but I was getting type errors. I thought /: and foldLeft were basically equivalent, but the compiler kept complaining that I needed explicit types for (m, s). What's up with that?

花辞树 2024-08-06 19:56:05


I read this question quickly, so I'm not sure if I'm missing something (like it has to work for 2.7.x or no scalaz):

import scalaz._
import Scalaz._
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)

You can change the monoid definition for Double and get another way to accumulate the values, here getting the max:

implicit val dbsg: Semigroup[Double] = semigroup((a,b) => math.max(a,b))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 4.4, world -> 2.2)

错々过的事 2024-08-06 19:56:05


I wrote a blog post about this, check it out:

http://www.nimrodstech.com/scala-map-merge/

Basically, using a scalaz Semigroup you can achieve this pretty easily.

It would look something like:

  import scalaz.Scalaz._
  listOfMaps reduce(_ |+| _)

小嗲 2024-08-06 19:56:05


A one-liner helper function whose usage reads almost as cleanly as the scalaz version:

def mergeMaps[K,V](m1: Map[K,V], m2: Map[K,V])(f: (V,V) => V): Map[K,V] =
    (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(mergeMaps(_,_)(_ + _))
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)

For ultimate readability, wrap it in an implicit custom type:

class MyMap[K,V](m1: Map[K,V]) {
  def merge(m2: Map[K,V])(f: (V,V) => V) =
    (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
}
implicit def toMyMap[K,V](m: Map[K,V]) = new MyMap(m)

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms reduceLeft { _.merge(_)(_ + _) } 

恋你朝朝暮暮 2024-08-06 19:56:05
def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] = {
  ms.flatten.foldLeft(Map[A, B]()) { case (acc, (k, v)) =>
    acc + (if (acc.contains(k)) k -> f(acc(k), v) else (k, v))
  }
}
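
Usage is the same as with the earlier mergeMap definitions, for example:

val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
mergeMap(ms)(_ + _)  // Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)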