Split a list at every element that satisfies a predicate (Scala)

Posted 2024-12-02 19:13:42

In a text file I have data in the form:

1)
text
text
2)
more text
3)
even more text
more even text
even more text
...

I read it as a list of Strings using the following:

val input = io.Source.fromFile("filename.txt").getLines().toList

I want to break the list down into sub-lists starting with 1), 2) etc.

I've come up with:

val subLists =
  input.foldRight( List(List[String]()) ) {
    (x, acc) =>
      if (x.matches("""[0-9]+\)""")) List() :: (x :: acc.head) :: acc.tail
      else (x :: acc.head) :: acc.tail
  }.tail

Can this be achieved more simply? It would be really nice if there were a built-in method to split a collection at every element that satisfies a predicate (hint, hint, library designers :)).

若相惜即相离 2024-12-09 19:13:42

A foldRight with a complicated argument is usually a sign that you might as well write this with recursion and factor it out into its own method while you are at it. Here's what I came up with. First, let's generalize to a generic method, groupPrefix:

 /** Returns shortest possible list of lists xss such that
  *   - xss.flatten == xs
  *   - No sublist in xss contains an element matching p in its tail
  */
 def groupPrefix[T](xs: List[T])(p: T => Boolean): List[List[T]] = xs match {
   case List() => List()
   case x :: xs1 => 
     val (ys, zs) = xs1 span (!p(_))
     (x :: ys) :: groupPrefix(zs)(p)  
 }

Now you get the result simply by calling

 groupPrefix(input)(_ matches """\d+\)""")
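
As a quick check, here is a minimal self-contained sketch that runs groupPrefix on data shaped like the sample in the question. The names `sample` and `isMarker` are hypothetical and only there for illustration; the real input would come from getLines().

```scala
// groupPrefix as defined above, repeated so the snippet is self-contained.
def groupPrefix[T](xs: List[T])(p: T => Boolean): List[List[T]] = xs match {
  case Nil => Nil
  case x :: xs1 =>
    // ys: everything up to (but not including) the next element matching p
    val (ys, zs) = xs1.span(!p(_))
    (x :: ys) :: groupPrefix(zs)(p)
}

// Hypothetical stand-in for the lines read from the file.
val sample = List("1)", "text", "text", "2)", "more text", "3)", "even more text")

// Hypothetical helper: true for marker lines such as "1)" or "42)".
def isMarker(line: String): Boolean = line.matches("""\d+\)""")

val groups = groupPrefix(sample)(isMarker)
// groups == List(
//   List("1)", "text", "text"),
//   List("2)", "more text"),
//   List("3)", "even more text"))
```

Note that the first element always starts a group, whether or not it matches p, so any lines before the first marker end up in a group of their own.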


逆夏时光 2024-12-09 19:13:42

I have the honor of adding an answer next to the great @MartinOdersky!

Since Scala 2.13 we can use List.unfold:

List.unfold(input) {
  case Nil =>
    None
  case x :: rest =>
    // prefix: lines up to the next marker; suffix: the state for the next step
    val (prefix, suffix) = rest.span(!_.matches("""\d+\)"""))
    Some((x :: prefix, suffix))
}

Code run at Scastie.
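
To try this on data shaped like the sample in the question, the unfold call can be wrapped in a small helper. This is only a sketch under assumptions: `splitBefore`, `lines` and `chunks` are hypothetical names that do not appear in the original, and List.unfold needs Scala 2.13 or later.

```scala
// Hypothetical helper wrapping the unfold approach; requires Scala 2.13+.
def splitBefore[T](xs: List[T])(p: T => Boolean): List[List[T]] =
  List.unfold(xs) {
    case Nil => None // nothing left: stop unfolding
    case x :: rest =>
      // prefix: elements up to the next match; suffix: the remaining state
      val (prefix, suffix) = rest.span(!p(_))
      Some((x :: prefix, suffix))
  }

// Hypothetical stand-in for the lines read from the file.
val lines = List("1)", "text", "text", "2)", "more text", "3)", "even more text")

val chunks = splitBefore(lines)(_.matches("""\d+\)"""))
// chunks == List(
//   List("1)", "text", "text"),
//   List("2)", "more text"),
//   List("3)", "even more text"))
```

As far as I can tell, unfold builds the result iteratively, so this version should not grow the call stack with the number of groups the way a plain recursive method does.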
