Merge maps by key

Posted on 2024-12-09 19:17:26

Say I have two maps:

val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

I want to merge these maps by key, applying some function to collect the values (in this particular case I want to collect them into a Seq), giving:

val c = Map(1 -> Seq("one", "un"), 2 -> Seq("two", "deux"), 3 -> Seq("three", "trois"))

It feels like there should be a nice, idiomatic way of doing this.


Comments (8)

檐上三寸雪 2024-12-16 19:17:26

scala.collection.immutable.IntMap has an intersectionWith method that does precisely what you want (I believe):

import scala.collection.immutable.IntMap

val a = IntMap(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
val b = IntMap(1 -> "un", 2 -> "deux", 3 -> "trois")

val merged = a.intersectionWith(b, (_, av, bv: String) => Seq(av, bv))

This gives you IntMap(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois)). Note that it correctly ignores the key that only occurs in a.

As a side note: I've often found myself wanting the unionWith, intersectionWith, etc. functions from Haskell's Data.Map in Scala. I don't think there's any principled reason that they should only be available on IntMap, instead of in the base collection.Map trait.
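As a rough illustration of that side note, the two combinators can be sketched for a plain Map using only the standard library. The object name MapCombinators and both method signatures are made up for this example (modelled loosely on the Haskell names) and are not part of the Scala collections API:

```scala
object MapCombinators {
  // Hypothetical helper: keep only keys present in both maps,
  // combining the two values for each shared key with f.
  def intersectionWith[K, A, B, C](m1: Map[K, A], m2: Map[K, B])(f: (A, B) => C): Map[K, C] =
    m1.collect { case (k, a) if m2.contains(k) => k -> f(a, m2(k)) }

  // Hypothetical helper: keep every key from either map;
  // f is only called for keys that occur in both.
  def unionWith[K, A](m1: Map[K, A], m2: Map[K, A])(f: (A, A) => A): Map[K, A] =
    m2.foldLeft(m1) { case (acc, (k, b)) =>
      acc.updated(k, acc.get(k).map(a => f(a, b)).getOrElse(b))
    }

  def main(args: Array[String]): Unit = {
    val a = Map(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
    val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")
    // Key 4 only occurs in a, so it is dropped by the intersection.
    println(intersectionWith(a, b)((x, y) => Seq(x, y)))
    // Key 4 is kept by the union, with its value from a unchanged.
    println(unionWith(a, b)((x, y) => x + "/" + y))
  }
}
```

Putting the combining function in a second parameter list lets the compiler infer the lambda's parameter types from the two maps, so no annotation like the bv: String above is needed.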

舟遥客 2024-12-16 19:17:26
val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

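// Concatenate both maps as lists of (key, value) pairs
val c = a.toList ++ b.toList
// Group the pairs by key, then keep only the values within each group
val d = c.groupBy(_._1).map{case(k, v) => k -> v.map(_._2).toSeq}
//d: scala.collection.immutable.Map[Int,Seq[java.lang.String]] =
        //Map((2,List(two, deux)), (1,List(one, un)), (3,List(three, trois)))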
迷荒 2024-12-16 19:17:26

Scalaz adds a method |+| for any type A for which a Semigroup[A] is available.

If you mapped your Maps so that each value was a single-element sequence, then you could use this quite simply:

scala> a.mapValues(Seq(_)) |+| b.mapValues(Seq(_))
res3: scala.collection.immutable.Map[Int,Seq[java.lang.String]] = Map(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois))
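To show roughly what happens behind |+| for maps, here is a minimal sketch that uses no Scalaz at all. The Semigroup trait, its instances, and the combine and lift helpers below are hypothetical stand-ins written for this example; Scalaz's real type class is richer and its implementation differs:

```scala
object SemigroupSketch {
  // Hypothetical minimal type class: one associative "append" operation.
  trait Semigroup[A] { def append(x: A, y: A): A }

  object Semigroup {
    // Sequences combine by concatenation.
    implicit def seqSemigroup[T]: Semigroup[Seq[T]] =
      new Semigroup[Seq[T]] {
        def append(x: Seq[T], y: Seq[T]): Seq[T] = x ++ y
      }

    // A Map is a semigroup whenever its values are:
    // values under a shared key are appended, other entries are kept as-is.
    implicit def mapSemigroup[K, V](implicit sv: Semigroup[V]): Semigroup[Map[K, V]] =
      new Semigroup[Map[K, V]] {
        def append(x: Map[K, V], y: Map[K, V]): Map[K, V] =
          y.foldLeft(x) { case (acc, (k, v)) =>
            acc.updated(k, acc.get(k).map(old => sv.append(old, v)).getOrElse(v))
          }
      }
  }

  // Stand-in for |+|.
  def combine[A](x: A, y: A)(implicit s: Semigroup[A]): A = s.append(x, y)

  // Wrap every value in a single-element Seq (a strict alternative to mapValues).
  def lift[K, V](m: Map[K, V]): Map[K, Seq[V]] = m.map { case (k, v) => k -> Seq(v) }
}
```

With the maps from the question, SemigroupSketch.combine(SemigroupSketch.lift(a), SemigroupSketch.lift(b)) produces the merged map. Note that a key present in only one map is kept, so this behaves like a union rather than an intersection.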
罗罗贝儿 2024-12-16 19:17:26

Starting Scala 2.13, you can use groupMap which (as its name suggests) is an equivalent of a groupBy followed by map on values:

// val map1 = Map(1 -> "one", 2 -> "two",  3 -> "three")
// val map2 = Map(1 -> "un",  2 -> "deux", 3 -> "trois")
(map1.toSeq ++ map2).groupMap(_._1)(_._2)
// Map(1 -> List("one", "un"), 2 -> List("two", "deux"), 3 -> List("three", "trois"))

This:

  • Concatenates the two maps as a sequence of tuples (List((1, "one"), (2, "two"), (3, "three"), (1, "un"), (2, "deux"), (3, "trois"))). For conciseness, map2 is passed to ++ directly, which accepts any collection of tuples and appends its entries to map1.toSeq - but you could choose to make the conversion explicit by using map2.toSeq.

  • Groups elements based on their first tuple part (_._1) (group part of groupMap)

  • Maps grouped values to their second tuple part (_._2) (map part of groupMap)
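The steps above can be sketched with the intermediate values named. The object and value names are only for illustration, and the groupMap line assumes Scala 2.13 or later:

```scala
object GroupMapSteps {
  val map1 = Map(1 -> "one", 2 -> "two", 3 -> "three")
  val map2 = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

  // Step 1: concatenate both maps into one sequence of (key, value) pairs.
  val pairs: Seq[(Int, String)] = map1.toSeq ++ map2

  // Steps 2 and 3 written separately: group by key, then strip the key from each pair.
  val viaGroupBy: Map[Int, Seq[String]] =
    pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }

  // The same two steps fused into one call.
  val viaGroupMap: Map[Int, Seq[String]] = pairs.groupMap(_._1)(_._2)
}
```

Both results are equal; within each group the values keep the order in which they appear in pairs, so the value from map1 comes first.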

风月客 2024-12-16 19:17:26
val en = Map(1 -> "one", 2 -> "two", 3 -> "three")
val fr = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Pair up the values of keys present in both maps; keys missing from m2 are dropped
def innerJoin[K, A, B](m1: Map[K, A], m2: Map[K, B]): Map[K, (A, B)] = {
  m1.flatMap{ case (k, a) => 
    m2.get(k).map(b => Map((k, (a, b)))).getOrElse(Map.empty[K, (A, B)])
  }
}

innerJoin(en, fr) // Map(1 -> ("one", "un"), 2 -> ("two", "deux"), 3 -> ("three", "trois")): Map[Int, (String, String)]
四叶草在未来唯美盛开 2024-12-16 19:17:26

Here is my first approach before looking for the other solutions:

for (x <- a) yield 
  x._1 -> Seq (a.get (x._1), b.get (x._1)).flatten

To avoid elements which happen to exist only in a or b, a filter is handy:

(for (x <- a) yield 
  x._1 -> Seq (a.get (x._1), b.get (x._1)).flatten).filter (_._2.size == 2)

Flatten is needed, because b.get (x._1) returns an Option. To make flatten work, the first element has to be an option too, so we can't just use x._2 here.
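A small sketch of that flatten step in isolation, with made-up sample maps in which key 4 exists only in a. The last value shows one possible way to use the plain value after all, by turning only the Option into a list:

```scala
object FlattenOptions {
  val a = Map(1 -> "one", 2 -> "two", 4 -> "four")
  val b = Map(1 -> "un", 2 -> "deux")

  // Seq[Option[String]] flattens to Seq[String]; None entries disappear.
  val both: Seq[String] = Seq(a.get(1), b.get(1)).flatten   // key 1 is in both maps
  val onlyA: Seq[String] = Seq(a.get(4), b.get(4)).flatten  // key 4 is missing from b

  // Alternative without the second lookup in a: prepend the plain value
  // to the (possibly empty) list made from b's Option.
  val alt: Seq[String] = a(4) +: b.get(4).toList
}
```

A result of size 1 therefore marks a key that was missing on one side, which is what the filter on size == 2 relies on.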

For sequences, it works too:

scala> val b = Map (1 -> Seq(1, 11, 111), 2 -> Seq(2, 22), 3 -> Seq(33, 333), 5 -> Seq(55, 5, 5555))
b: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(1, 11, 111), 2 -> List(2, 22), 3 -> List(33, 333), 5 -> List(55, 5, 5555))

scala> val a = Map (1 -> Seq(1, 101), 2 -> Seq(2, 212, 222), 3 -> Seq (3, 3443), 4 -> (44, 4, 41214))
a: scala.collection.immutable.Map[Int,ScalaObject with Equals] = Map(1 -> List(1, 101), 2 -> List(2, 212, 222), 3 -> List(3, 3443), 4 -> (44,4,41214))

scala> (for (x <- a) yield x._1 -> Seq (a.get (x._1), b.get (x._1)).flatten).filter (_._2.size == 2) 
res85: scala.collection.immutable.Map[Int,Seq[ScalaObject with Equals]] = Map(1 -> List(List(1, 101), List(1, 11, 111)), 2 -> List(List(2, 212, 222), List(2, 22)), 3 -> List(List(3, 3443), List(33, 333)))
舂唻埖巳落 2024-12-16 19:17:26

So I wasn't quite happy with either solution (I want to build a new type, so semigroup doesn't really feel appropriate, and Infinity's solution seemed quite complex), so I've gone with this for the moment. I'd be happy to see it improved:

def merge[A,B,C](a : Map[A,B], b : Map[A,B])(c : (B,B) => C) = {
  for (
    key <- (a.keySet ++ b.keySet);
    aval <- a.get(key); bval <- b.get(key)
  ) yield c(aval, bval)
}
merge(a,b){Seq(_,_)}

I wanted the behaviour of returning nothing when a key is missing from either map, i.e. keeping only keys present in both (which differs from other solutions), but a way of specifying this would be nice.
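One thing to watch in the merge above: iterating over a key set makes the for-comprehension build a Set of combined values, so the keys are lost and equal values collapse into one. A possible variant that keeps the keys and returns a Map is sketched here; the object name is invented for the example and the signature is otherwise the same:

```scala
object MergeKeepKeys {
  // Same idea as merge above, but each result is paired with its key and
  // the pairs are collected into a Map. Keys missing from either map are skipped.
  def merge[A, B, C](a: Map[A, B], b: Map[A, B])(c: (B, B) => C): Map[A, C] =
    (for {
      key  <- (a.keySet ++ b.keySet).toSeq
      aval <- a.get(key).toList // empty when key is not in a
      bval <- b.get(key).toList // empty when key is not in b
    } yield key -> c(aval, bval)).toMap

  def main(args: Array[String]): Unit = {
    val a = Map(1 -> "one", 2 -> "two", 4 -> "four")
    val b = Map(1 -> "un", 2 -> "deux", 5 -> "cinq")
    // Keys 4 and 5 each occur in only one map, so neither appears in the result.
    println(merge(a, b)(Seq(_, _)))
  }
}
```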

寄与心 2024-12-16 19:17:26
def merge[A,B,C,D](b : Map[A,B], c : Map[A,C])(d : (Option[B],Option[C]) => D): Map[A,D] = {
  (b.keySet ++ c.keySet).map(k => k -> d(b.get(k), c.get(k))).toMap
}

def optionSeqBiFunctionK[A]:(Option[A], Option[A]) => Seq[A] = _.toSeq ++ _.toSeq

merge(a,b)(optionSeqBiFunctionK)

