模式匹配和无限流

发布于 2024-12-06 01:45:30 字数 1564 浏览 1 评论 0原文

因此,我正在努力自学 Scala,我一直在玩的东西之一就是 Stream 类。我尝试使用 经典 Haskell 版本的 Dijkstra 解决方案 到 Hamming 的简单翻译数字问题:

object LazyHammingBad {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }

  val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 },
      merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
}

在解释器中尝试一下这个问题很快就导致了失望:

scala> LazyHammingBad.numbers.take(10).toList
java.lang.StackOverflowError

我决定看看其他人是否使用 Haskell 方法在 Scala 中解决了这个问题,并改编了此解决方案:

object LazyHammingGood {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    if (a.head < b.head) a.head #:: merge(a.tail, b)
    else if (b.head < a.head) b.head #:: merge(a, b.tail)
    else a.head #:: merge(a.tail, b.tail)

  val numbers: Stream[BigInt] = 
    1 #:: merge(numbers map {_ * 2}, 
            merge(numbers map {_ * 3}, numbers map {_ * 5}))
}

这个解决方案效果很好,但我仍然想知道我在 LazyHammingBad 中是如何出错的。使用 #:: 解构 x #:: xs 是否会出于某种原因强制对 xs 求值?有没有什么方法可以安全地使用无限流的模式匹配,或者如果您不想让事情崩溃,您是否只需要使用 headtail

So, I'm working to teach myself Scala, and one of the things I've been playing with is the Stream class. I tried to use a naïve translation of the classic Haskell version of Dijkstra's solution to the Hamming number problem:

object LazyHammingBad {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }

  val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 },
      merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
}

Taking this for a spin in the interpreter led quickly to disappointment:

scala> LazyHammingBad.numbers.take(10).toList
java.lang.StackOverflowError

I decided to look to see if other people had solved the problem in Scala using the Haskell approach, and adapted this solution from Rosetta Code:

object LazyHammingGood {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    if (a.head < b.head) a.head #:: merge(a.tail, b)
    else if (b.head < a.head) b.head #:: merge(a, b.tail)
    else a.head #:: merge(a.tail, b.tail)

  val numbers: Stream[BigInt] = 
    1 #:: merge(numbers map {_ * 2}, 
            merge(numbers map {_ * 3}, numbers map {_ * 5}))
}

This one worked nicely, but I still wonder how I went wrong in LazyHammingBad. Does using #:: to destructure x #:: xs force the evaluation of xs for some reason? Is there any way to use pattern matching safely with infinite streams, or do you just have to use head and tail if you don't want things to blow up?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

橘亓 2024-12-13 01:45:30

匹配 {case x#::xs =>...val (x, xs) = (a.head, a.tail) 大致相同。因此,坏版本和好版本之间的区别在于,在坏版本中,您在一开始就调用 a.tailb.tail ,而不是仅仅使用它们来构建结果流的尾部。此外,当您在 #:: 右侧使用它们时(不是模式匹配,而是构建结果,如 #:: merge(abtail) 中,您实际上并没有调用merge,这只会在稍后访问返回的 Stream 的尾部时完成,因此在好的版本中,对 merge 的调用根本不会调用 tail 在坏的版本中,它会调用它。现在

,如果您考虑数字,甚至是简化版本,请说1 #:: merge(numbers, anotherStream),当您调用时,您会调用 tail (如 take(10) 那样),< code>merge 必须对 numbers 调用 tail,它会用 numbers 调用 merge。 > 作为参数,调用 tails numbers,它调用 merge,它调用 tail...

相比之下,在超级懒惰的 Haskell 中,当你进行模式匹配时,它几乎不执行任何操作当您执行 case l of x:xs 时,它将评估 l 足以知道它是空列表还是缺点。
如果这确实是一个缺点,xxs 将作为两个 thunk 提供,这些函数最终将在稍后提供对内容的访问。 Scala 中最接近的等效项是仅测试 empty

另请注意,在 Scala Stream 中,虽然 tail 是惰性的,但 head 却不是。当您有一个(非空)流时,必须知道头部。这意味着当您获得流的尾部(本身就是一个流)时,必须计算其头部(即原始流的第二个元素)。这有时会出现问题,但在您的示例中,您甚至在到达那里之前就失败了。

a match {case x#::xs =>... is about the same as val (x, xs) = (a.head, a.tail). So the difference between the bad version and the good one, is that in that in the bad version, you're calling a.tail and b.tail right at the start, instead of just use them to build the tail of the resulting stream. Furthermore when you use them at the right of #:: (not pattern matching, but building the result, as in #:: merge(a.b.tail) you are not actually calling merge, that will be done only later, when accessing the tail of the returned Stream. So in the good version, a call to merge does not call tail at all. In the bad version, it calls it right at start.

Now if you consider numbers, or even a simplified version, say 1 #:: merge(numbers, anotherStream), when you call you call tail on that (as take(10) will), merge has to be evaluated. You call tail on numbers, which call merge with numbers as parameters, which calls tails on numbers, which calls merge, which calls tail...

By contrast, in super lazy Haskell, when you pattern match, it does barely any work. When you do case l of x:xs, it will evaluate l just enough to know whether it is an empty list or a cons.
If it is indeed a cons, x and xs will be available as two thunks, functions that will eventually give access, later, to content. The closest equivalent in Scala would be to just test empty.

Note also that in Scala Stream, while the tail is lazy, the head is not. When you have a (non empty) Stream, the head has to be known. Which means that when you get the tail of the stream, itself a stream, its head, that is the second element of the original stream, has to be computed. This is sometimes problematic, but in your example, you fail before even getting there.

恰似旧人归 2024-12-13 01:45:30

请注意,您可以通过为 Stream 定义更好的模式匹配器来完成您想要的操作:

这是我刚刚在 Scala Worksheet

object HammingTest {
  // A convenience object for stream pattern matching
  object #:: {
    class TailWrapper[+A](s: Stream[A]) {
      def unwrap = s.tail
    }
    object TailWrapper {
      implicit def unwrap[A](wrapped: TailWrapper[A]) = wrapped.unwrap
    }
    def unapply[A](s: Stream[A]): Option[(A, TailWrapper[A])] = {
      if (s.isEmpty) None
      else {
        Some(s.head, new TailWrapper(s))
      }
    }
  }

  def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }                                             //> merge: (a: Stream[BigInt], b: Stream[BigInt])Stream[BigInt]

  lazy val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 }, merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
                                                  //> numbers  : Stream[BigInt] = <lazy>
  numbers.take(10).toList                         //> res0: List[BigInt] = List(1, 2, 3, 4, 5, 6, 8, 9, 10, 12)
}

现在您只需确保 Scala 找到您的 对象 #:: 而不是 Stream 中的对象。 class 每当它进行模式匹配时。为了方便这一点,最好使用不同的名称,例如 #>:##::,然后记住在模式匹配时始终使用该名称。

如果您需要匹配空流,请使用case Stream.Empty。使用 case Stream() 会尝试评估模式匹配中的整个流,这会导致悲伤。

Note that you can do what you want by defining a better pattern matcher for Stream:

Here's a bit I just pulled together in a Scala Worksheet:

object HammingTest {
  // A convenience object for stream pattern matching
  object #:: {
    class TailWrapper[+A](s: Stream[A]) {
      def unwrap = s.tail
    }
    object TailWrapper {
      implicit def unwrap[A](wrapped: TailWrapper[A]) = wrapped.unwrap
    }
    def unapply[A](s: Stream[A]): Option[(A, TailWrapper[A])] = {
      if (s.isEmpty) None
      else {
        Some(s.head, new TailWrapper(s))
      }
    }
  }

  def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }                                             //> merge: (a: Stream[BigInt], b: Stream[BigInt])Stream[BigInt]

  lazy val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 }, merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
                                                  //> numbers  : Stream[BigInt] = <lazy>
  numbers.take(10).toList                         //> res0: List[BigInt] = List(1, 2, 3, 4, 5, 6, 8, 9, 10, 12)
}

Now you just need to make sure that Scala finds your object #:: instead of the one in Stream.class whenever it's doing pattern matching. To facilitate that, it might be best to use a different name like #>: or ##:: and then just remember to always use that name when pattern matching.

If you ever need to match the empty stream, use case Stream.Empty. Using case Stream() will attempt to evaluate your entire stream there in the pattern match, which will lead to sadness.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文