Scala partition/collect usage

Posted on 2024-10-13 15:24:12

Is it possible to use one call to collect to make 2 new lists? If not, how can I do this using partition?

Comments (8)

李白 2024-10-20 15:24:12

collect (defined on TraversableLike and available in all its subclasses) works with a collection and a PartialFunction. It so happens that a block of case clauses inside braces is a partial function (see section 8.5 of the Scala Language Specification [warning - PDF]).

As in exception handling:

try {
  // ... do something risky ...
} catch {
  // The contents of this catch block are a partial function
  case e: IOException => // ...
  case e: OtherException => // ...
}

It's a handy way to define a function that will only accept some values of a given type.

Consider using it on a list of mixed values:

val mixedList = List("a", 1, 2, "b", 19, 42.0) //this is a List[Any]
val results = mixedList collect {
  case s: String => "String:" + s
  case i: Int => "Int:" + i.toString
}

The argument to the collect method is a PartialFunction[Any,String]: PartialFunction because it's not defined for all possible inputs of type Any (that being the element type of the List), and String because that's what all the clauses return.

If you tried to use map instead of collect, the double value at the end of mixedList would cause a MatchError. Using collect just discards this value, as well as any other value for which the PartialFunction is not defined.
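
To make that difference concrete, here is a minimal sketch. The object name CollectVsMap and the value names are made up for illustration; the case clauses are the ones from the example above, and Try is only there so the failure can be inspected instead of crashing.

```scala
import scala.util.Try

object CollectVsMap {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19, 42.0)

  // The same case clauses as above, named so they can be reused.
  val describe: PartialFunction[Any, String] = {
    case s: String => "String:" + s
    case i: Int    => "Int:" + i
  }

  // collect silently skips 42.0, for which describe is not defined.
  val collected: List[String] = mixedList.collect(describe)

  // map applies describe to every element, so 42.0 raises a MatchError;
  // Try captures the exception instead of letting it propagate.
  val mapped: Try[List[String]] = Try(mixedList.map(describe))
}
```

Here collected holds the five described values, while mapped is a Failure wrapping the MatchError.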

One possible use would be to apply different logic to elements of the list:

var strings = List.empty[String]
var ints = List.empty[Int]
mixedList collect {
  case s: String => strings :+= s
  case i: Int => ints :+= i
}

Although this is just an example, using mutable variables like this is considered by many to be a war crime, so please don't do it!

A much better solution is to use collect twice:

val strings = mixedList collect { case s: String => s }
val ints = mixedList collect { case i: Int => i }

Or if you know for certain that the list only contains two types of values, you can use partition, which splits a collection in two depending on whether or not each value matches some predicate:

//if the list only contains Strings and Ints:
val (strings, ints) = mixedList partition { case s: String => true; case _ => false }

The catch here is that both strings and ints are of type List[Any], though you can easily coerce them back to something more typesafe (perhaps by using collect...)
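
As a rough sketch of that coercion step (the object name PartitionThenNarrow and the intermediate names anyStrings and anyInts are invented here), partition first and then narrow each half with collect:

```scala
object PartitionThenNarrow {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19)

  // partition keeps the element type, so both halves are List[Any].
  val (anyStrings, anyInts) = mixedList.partition {
    case _: String => true
    case _         => false
  }

  // collect with a type pattern recovers the precise element types.
  val strings: List[String] = anyStrings.collect { case s: String => s }
  val ints: List[Int]       = anyInts.collect { case i: Int => i }
}
```

Note that this walks the data three times in total, so calling collect twice on the original list is arguably simpler.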

If you already have a type-safe collection and want to split on some other property of the elements, then things are a bit easier for you:

val intList = List(2,7,9,1,6,5,8,2,4,6,2,9,8)
val (big,small) = intList partition (_ > 5)
//big and small are both now List[Int]s

Hope that sums up how the two methods can help you out here!

喜爱纠缠 2024-10-20 15:24:12

Not sure how to do it with collect without using mutable lists, but partition can use pattern matching as well (just a little more verbose)

List("a", 1, 2, "b", 19).partition { 
  case s:String => true
  case _ => false 
}

静待花开 2024-10-20 15:24:12

The signature of the normally-used collect on, say, Seq, is

collect[B](pf: PartialFunction[A,B]): Seq[B]

which, in Scala 2.12 and earlier, is really a particular case of

collect[B, That](pf: PartialFunction[A,B])(
  implicit bf: CanBuildFrom[Seq[A], B, That]
): That

So if you use it in default mode, the answer is no, assuredly not: you get exactly one sequence out from it. If you follow CanBuildFrom through Builder, you see that it would be possible to make That actually be two sequences, but it would have no way of being told which sequence an item should go into, since the partial function can only say "yes, I belong" or "no, I do not belong".

So what do you do if you want to have multiple conditions that result in your list being split into a bunch of different pieces? One way is to create an indicator function A => Int, where your A is mapped into a numbered class, and then use groupBy. For example:

def optionClass(a: Any) = a match {
  case None => 0
  case Some(x) => 1
  case _ => 2
}
scala> List(None,3,Some(2),5,None).groupBy(optionClass)
res11: scala.collection.immutable.Map[Int,List[Any]] = 
  Map((2,List(3, 5)), (1,List(Some(2))), (0,List(None, None)))

Now you can look up your sub-lists by class (0, 1, and 2 in this case). Unfortunately, if you want to ignore some inputs, you still have to put them in a class (e.g. you probably don't care about the multiple copies of None in this case).
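
A small sketch of how the lookup might go in practice, assuming the same optionClass indicator function as above (the object and value names are illustrative only). Classes you don't care about still get built by groupBy, but you never have to read them:

```scala
object GroupByClasses {
  // Same indicator function as above.
  def optionClass(a: Any): Int = a match {
    case None    => 0
    case Some(_) => 1
    case _       => 2
  }

  val grouped: Map[Int, List[Any]] =
    List(None, 3, Some(2), 5, None).groupBy(optionClass)

  // Look up only the classes of interest; getOrElse covers a class
  // that happens to have no members in the input.
  val somes: List[Any]  = grouped.getOrElse(1, Nil)
  val others: List[Any] = grouped.getOrElse(2, Nil)
  // Class 0 (the None values) is simply never read.
}
```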

小清晰的声音 2024-10-20 15:24:12

I use this. One nice thing about it is that it combines partitioning and mapping in one iteration. One drawback is that it allocates a bunch of temporary objects (the Either.Left and Either.Right instances).

import scala.annotation.tailrec

/**
 * Splits the input list into a list of B's and a list of C's, depending on which type of value the mapper function returns.
 */
def mapSplit[A,B,C](in: List[A])(mapper: (A) => Either[B,C]): (List[B], List[C]) = {
  @tailrec
  def mapSplit0(in: List[A], bs: List[B], cs: List[C]): (List[B], List[C]) = {
    in match {
      case a :: as =>
        mapper(a) match {
          case Left(b)  => mapSplit0(as, b :: bs, cs     )
          case Right(c) => mapSplit0(as, bs,      c :: cs)
        }
      case Nil =>
        (bs.reverse, cs.reverse)
    }
  }

  mapSplit0(in, Nil, Nil)
}

val got = mapSplit(List(1,2,3,4,5)) {
  case x if x % 2 == 0 => Left(x)
  case y               => Right(y.toString * y)
}

assertEquals((List(2,4),List("1","333","55555")), got)

因为看清所以看轻 2024-10-20 15:24:12

Starting in Scala 2.13, most collections provide a partitionMap method, which partitions elements based on a function that returns either Right or Left.

That allows us to pattern match on the type (which, as with collect, gives the partitioned lists specific element types) or on any other pattern:

val (strings, ints) =
  List("a", 1, 2, "b", 19).partitionMap {
    case s: String => Left(s)
    case x: Int    => Right(x)
  }
// strings: List[String] = List("a", "b")
// ints: List[Int] = List(1, 2, 19)
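
One thing to watch: partitionMap needs a result for every element, so a value matching neither case (say a Double) would throw a MatchError. A possible workaround, sketched here for Scala 2.13+ with made-up names, is to collect into Either first and then split:

```scala
object PartitionMapWithDiscard {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19, 42.0)

  val (strings, ints): (List[String], List[Int]) =
    mixedList
      // collect drops anything that is neither a String nor an Int (here 42.0)
      .collect[Either[String, Int]] {
        case s: String => Left(s)
        case i: Int    => Right(i)
      }
      // every remaining element is already a Left or a Right
      .partitionMap(identity)
}
```

This costs an intermediate list of Either values, which may or may not matter for your data sizes.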

灰色世界里的红玫瑰 2024-10-20 15:24:12

I could not find a satisfying solution to this basic problem here.
I don't need a lecture on collect and don't care if this is someone's homework. Also, I don't want something that works only for List.

So here is my stab at it. Efficient and compatible with any TraversableOnce, even strings:

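// Scala 2.12 and earlier only: CanBuildFrom was removed in Scala 2.13
import scala.collection.generic.CanBuildFrom

implicit class TraversableOnceHelper[A,Repr](private val repr: Repr)(implicit isTrav: Repr => TraversableOnce[A]) {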

  def collectPartition[B,Left](pf: PartialFunction[A, B])
  (implicit bfLeft: CanBuildFrom[Repr, B, Left], bfRight: CanBuildFrom[Repr, A, Repr]): (Left, Repr) = {
    val left = bfLeft(repr)
    val right = bfRight(repr)
    val it = repr.toIterator
    while (it.hasNext) {
      val next = it.next
      if (!pf.runWith(left += _)(next)) right += next
    }
    left.result -> right.result
  }

  def mapSplit[B,C,Left,Right](f: A => Either[B,C])
  (implicit bfLeft: CanBuildFrom[Repr, B, Left], bfRight: CanBuildFrom[Repr, C, Right]): (Left, Right) = {
    val left = bfLeft(repr)
    val right = bfRight(repr)
    val it = repr.toIterator
    while (it.hasNext) {
      f(it.next) match {
        case Left(next) => left += next
        case Right(next) => right += next
      }
    }
    left.result -> right.result
  }
}

Example usages:

val (syms, ints) =
  Seq(Left('ok), Right(42), Right(666), Left('ko), Right(-1)) mapSplit identity

val ctx = Map('a -> 1, 'b -> 2) map {case(n,v) => n->(n,v)}
val (bound, unbound) = Vector('a, 'a, 'c, 'b) collectPartition ctx
println(bound: Vector[(Symbol, Int)], unbound: Vector[Symbol])
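
For readers on Scala 2.13 or later, where CanBuildFrom no longer exists, the collectPartition idea can be approximated on top of the built-in partitionMap. This is only a sketch: the helper below is hypothetical, restricted to List for brevity, and uses String keys instead of symbol literals.

```scala
object CollectPartition213 {
  // Hypothetical helper, not part of the standard library.
  // Elements where pf is defined go left (mapped); the rest go right (unchanged).
  def collectPartition[A, B](xs: List[A])(pf: PartialFunction[A, B]): (List[B], List[A]) =
    xs.partitionMap(a => pf.lift(a).toLeft(a))

  // A Map is itself a PartialFunction from keys to values.
  val ctx: Map[String, Int] = Map("a" -> 1, "b" -> 2)
  val (bound, unbound) = collectPartition(List("a", "a", "c", "b"))(ctx)
}
```

pf.lift turns the partial function into one returning Option, and toLeft falls back to the original element on the right when there is no match.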

纸伞微斜 2024-10-20 15:24:12

Something like this could help

def partitionMap[IN, A, B](seq: Seq[IN])(function: IN => Either[A, B]): (Seq[A], Seq[B]) = {
  val (eitherLeft, eitherRight) = seq.map(function).partition(_.isLeft)
  eitherLeft.map(_.left.get) -> eitherRight.map(_.right.get)
}

To call it

val seq: Seq[Any] = Seq(1, "A", 2, "B")
val (ints, strings) = CollectionUtils.partitionMap(seq) {
  case int: Int    => Left(int)
  case str: String => Right(str)
}
ints shouldBe Seq(1, 2)
strings shouldBe Seq("A", "B")

Advantage: a simple API, similar to the partitionMap that ships with Scala 2.13.

Disadvantage: the collection is traversed twice, and there is no support for CanBuildFrom.
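
If the double traversal is a concern, one possible variant (a sketch only, with the object name SinglePassPartitionMap made up here) keeps the same signature but fills two builders in a single pass:

```scala
object SinglePassPartitionMap {
  def partitionMap[IN, A, B](seq: Seq[IN])(f: IN => Either[A, B]): (Seq[A], Seq[B]) = {
    val lefts  = Seq.newBuilder[A]
    val rights = Seq.newBuilder[B]
    // One traversal: each element is routed to exactly one builder.
    seq.foreach { x =>
      f(x) match {
        case Left(a)  => lefts += a
        case Right(b) => rights += b
      }
    }
    (lefts.result(), rights.result())
  }

  val (ints, strings) = partitionMap[Any, Int, String](Seq(1, "A", 2, "B")) {
    case i: Int    => Left(i)
    case s: String => Right(s)
  }
}
```

It uses local mutable builders internally, but nothing mutable escapes the method.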

皇甫轩 2024-10-20 15:24:12

I would personally use a foldLeft or foldRight for this. It has a couple of advantages over some of the other answers here: no use of var, so this is a pure function (if you care about that type of thing); only one traversal through the list; and no extraneous Either objects.

The idea of a fold is to convert a list into a single type. However, nothing is stopping us from having this single type be a Tuple of any number of lists.

This example converts a list into three different lists:

  val list: List[Any] = List(1,"two", 3, "four", 5.5)

  // Start with 3 empty lists and prepend to them each time we find a new value
  list.foldRight((List.empty[Int], List.empty[String], List.empty[Double])) {
    (nextItem, newCollection) => {
      nextItem match {
        case i: Int => newCollection.copy(_1 = i :: newCollection._1)
        case s: String => newCollection.copy(_2 = s :: newCollection._2)
        case f: Double => newCollection.copy(_3 = f :: newCollection._3)
        case _ => newCollection
      }
    }
  }
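
Applied to the original question (two lists out of one), the same fold might look like the sketch below. The names are illustrative; it pattern matches on the element and the accumulator tuple together instead of using copy.

```scala
object FoldSplit {
  val mixedList: List[Any] = List("a", 1, 2, "b", 19, 42.0)

  // foldRight walks from the right, so prepending keeps the original order.
  val (strings, ints) =
    mixedList.foldRight((List.empty[String], List.empty[Int])) {
      case (s: String, (ss, is)) => (s :: ss, is)
      case (i: Int, (ss, is))    => (ss, i :: is)
      case (_, acc)              => acc // ignore every other type
    }
}
```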