Split a list at every element that satisfies a predicate (Scala)

Posted on 2024-12-02 19:13:42

In a text file I have data in the form:

1)
text
text
2)
more text
3)
even more text
more even text
even more text
...

I read it as a list of Strings using the following:

val input = io.Source.fromFile("filename.txt").getLines().toList

I want to break the list down into sub-lists, each starting with a marker line such as 1), 2), etc.

I've come up with:

val subLists =
  input.foldRight( List(List[String]()) ) {
    (x, acc) =>
      if (x.matches("""[0-9]+\)""")) List() :: (x :: acc.head) :: acc.tail
      else (x :: acc.head) :: acc.tail
  }.tail

Can this be achieved more simply? It would be really nice if there were a built-in method to split a collection at every element that satisfies a predicate (hint, hint, library designers :)).

Comments (2)

若相惜即相离 2024-12-09 19:13:42

A foldRight with a complicated argument is usually a sign that you might as well write this using recursion and, while you are at it, factor it out into its own method. Here's what I came up with. First, let's generalize to a generic method, groupPrefix:

 /** Returns shortest possible list of lists xss such that
  *   - xss.flatten == xs
  *   - No sublist in xss contains an element matching p in its tail
  */
 def groupPrefix[T](xs: List[T])(p: T => Boolean): List[List[T]] = xs match {
   case List() => List()
   case x :: xs1 => 
     val (ys, zs) = xs1 span (!p(_))
     (x :: ys) :: groupPrefix(zs)(p)  
 }

Now you get the result simply by calling

 groupPrefix(input)(_ matches """\d+\)""")
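
To see what this call produces, here is a small self-contained sketch that applies groupPrefix to in-memory sample lines instead of reading a file. The object name GroupPrefixDemo, the sample list, and the isMarker value are assumptions made up for illustration; only groupPrefix and the regex come from the answer above.

```scala
// Illustrative sketch: GroupPrefixDemo, sample and isMarker are hypothetical names.
object GroupPrefixDemo {
  // Same method as in the answer: start a new group at each element matching p.
  def groupPrefix[T](xs: List[T])(p: T => Boolean): List[List[T]] = xs match {
    case Nil => Nil
    case x :: rest =>
      // ys = elements up to (not including) the next match; zs = everything after
      val (ys, zs) = rest.span(!p(_))
      (x :: ys) :: groupPrefix(zs)(p)
  }

  // Stand-in for the lines read from the text file
  val sample: List[String] =
    List("1)", "text", "text", "2)", "more text", "3)", "even more text")

  // A marker line is one or more digits followed by ')'
  val isMarker: String => Boolean = _.matches("""\d+\)""")

  val groups: List[List[String]] = groupPrefix(sample)(isMarker)

  def main(args: Array[String]): Unit =
    groups.foreach(println) // one group per line
}
```

Note that any lines before the first marker end up in their own leading group here, whereas the foldRight version in the question discards them through the final .tail. The recursion is not tail-recursive, so the stack depth grows with the number of groups.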

逆夏时光 2024-12-09 19:13:42

I have the honor of adding an answer next to the great @MartinOdersky!

Since Scala 2.13, we can use List.unfold:

List.unfold(input) {
  case Nil =>
    None // nothing left: stop unfolding
  case x :: rest =>
    // prefix = lines up to the next marker; suffix = the remaining state
    val (prefix, suffix) = rest.span(!_.matches("""\d+\)"""))
    Some((x :: prefix, suffix))
}

Code run at Scastie.
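
The same idea can be wrapped in a generic helper so the predicate is passed in rather than hard-coded. This is a minimal sketch assuming Scala 2.13 or later; the names UnfoldSplitDemo, splitBefore, and the sample data are hypothetical, not part of the standard library or of the answer above.

```scala
// Illustrative sketch: UnfoldSplitDemo and splitBefore are hypothetical names.
object UnfoldSplitDemo {
  // Splits xs so that every element matching p starts a new sub-list.
  def splitBefore[T](xs: List[T])(p: T => Boolean): List[List[T]] =
    List.unfold[List[T], List[T]](xs) {
      case Nil => None // state exhausted: stop
      case head :: rest =>
        // Emit head plus everything up to the next match; continue from the rest
        val (prefix, suffix) = rest.span(!p(_))
        Some((head :: prefix, suffix))
    }

  // Sample input with a leading line that is not a marker
  val sample: List[String] = List("intro", "1)", "a", "2)", "b")

  val result: List[List[String]] = splitBefore(sample)(_.matches("""\d+\)"""))

  def main(args: Array[String]): Unit =
    result.foreach(println)
}
```

Because List.unfold builds the result iteratively, this variant does not recurse once per group the way a hand-written recursive method does.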
