How to flatten a nested For Comprehension that uses I/O?

Posted 2024-12-03 09:22:38

I am having trouble flattening a nested For Generator into a single For Generator.

I created MapSerializer to save and load Maps.

Listing of MapSerializer.scala:

import java.io.{ObjectInputStream, ObjectOutputStream}

object MapSerializer {
  def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
    (for (_ <- 1 to in.readInt()) yield {
      val key = in.readUTF()
      for (_ <- 1 to in.readInt()) yield {
        val value = in.readInt()
        (key, value)
      }
    }).flatten.groupBy(_._1).mapValues(_.map(_._2))

  def saveMap(out: ObjectOutputStream, map: Map[String, Seq[Int]]) {
    out.writeInt(map size)
    for ((key, values) <- map) {
      out.writeUTF(key)
      out.writeInt(values size)
      values.foreach(out.writeInt(_))
    }
  }
}

Modifying loadMap to assign key within the generator causes it to fail:

def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
  (for (_ <- 1 to in.readInt();
        key = in.readUTF()) yield {
    for (_ <- 1 to in.readInt()) yield {
      val value = in.readInt()
      (key, value)
    }
  }).flatten.groupBy(_._1).mapValues(_.map(_._2))

Here is the stacktrace I get:

java.io.UTFDataFormatException
    at java.io.ObjectInputStream$BlockDataInputStream.readWholeUTFSpan(ObjectInputStream.java)
    at java.io.ObjectInputStream$BlockDataInputStream.readOpUTFSpan(ObjectInputStream.java)
    at java.io.ObjectInputStream$BlockDataInputStream.readWholeUTFSpan(ObjectInputStream.java)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2819)
    at java.io.ObjectInputStream.readUTF(ObjectInputStream.java:1050)
    at MapSerializer$$anonfun$loadMap$1.apply(MapSerializer.scala:8)
    at MapSerializer$$anonfun$loadMap$1.apply(MapSerializer.scala:7)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
    at scala.collection.immutable.Range.foreach(Range.scala:76)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
    at scala.collection.immutable.Range.map(Range.scala:43)
    at MapSerializer$.loadMap(MapSerializer.scala:7)

I would like to flatten the loading code to a single For Comprehension, but I get errors that suggest that it is either executing in a different order or repeating steps I am not expecting it to repeat.

Why is it that moving the assignment of key into the generator causes it to fail?

Can I flatten this into a single generator? If so, what would that generator be?

2 Answers

左秋 2024-12-10 09:22:38

Thank you for the self-contained compiling code in your question. I don't think you want to flatten the loops, as the structure is not flat. You then need to use groupBy to recover the structure. Also, if you have "zero" -> Seq() as an element of the map, it would be lost. Using this simple map avoids the groupBy and preserves elements mapped to empty sequences:

def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] = {
  val size = in.readInt
  (1 to size).map{ _ =>
    val key = in.readUTF
    val nval = in.readInt
    key -> (1 to nval).map(_ => in.readInt)
  }(collection.breakOut)
}

I use breakOut to generate the right type, as otherwise I think the compiler complains about a mismatch between the generic Map and the immutable Map. You can also use Map() ++ (...).
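
For completeness, here is a minimal sketch of that Map() ++ (...) alternative. It is not from the original answer: the name loadMapViaConcat is just for illustration, the empty map gets an explicit type (Map.empty[String, IndexedSeq[Int]]) so the key and value types do not have to be inferred, and it assumes the same stream layout that saveMap writes:

import java.io.ObjectInputStream

def loadMapViaConcat(in: ObjectInputStream): Map[String, IndexedSeq[Int]] = {
  val size = in.readInt
  // Build the (key, values) pairs eagerly and add them to an explicitly
  // typed empty map instead of relying on collection.breakOut.
  Map.empty[String, IndexedSeq[Int]] ++ (1 to size).map { _ =>
    val key  = in.readUTF
    val nval = in.readInt
    key -> (1 to nval).map(_ => in.readInt)
  }
}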

Note: I arrived at this solution by being confused by your for loop and starting to rewrite it using flatMap and map:

val tuples = (1 to size).flatMap{ _ =>
  val key = in.readUTF
  println("key " + key)
  val nval = in.readInt
  (1 to nval).map(_ => key -> in.readInt)
}

I think something happens in the for loop when you don't use some of the generators. I thought this would be equivalent to:

val tuples = for {
  _ <- 1 to size
  key = in.readUTF
  nval = in.readInt
  _ <- 1 to nval
  value = in.readInt
} yield { key -> value }

But this is not the case, so I think I'm missing something in the translation.

Edit: figured out what's wrong with a single for loop. Short story: the translation of definitions within for loops caused the key = in.readUTF statement to be called consecutively before the inner loop is executed. To work around this, use view and force:

val tuples = (for {
  _ <- (1 to size).view
  key = in.readUTF
  nval = in.readInt
  _ <- 1 to nval
  value = in.readInt
} yield { key -> value }).force

The issue can be demonstrated more clearly with this piece of code:

val iter = Iterator.from(1)
val tuple = for {
  _ <- 1 to 3
  outer = iter.next
  _ <- 1 to 3
  inner = iter.next
} yield (outer, inner)

It returns Vector((1,4), (1,5), (1,6), (2,7), (2,8), (2,9), (3,10), (3,11), (3,12)) which shows that all outer values are evaluated before inner values. This is due to the fact that it is more or less translated to something like:

for {
  (i, outer) <- for (i <- (1 to 3)) yield (i, iter.next)
  _ <- 1 to 3
  inner = iter.next
} yield (outer, inner)

This computes all the outer iter.next values first. Going back to the original use case, all the in.readUTF calls would happen consecutively, before the in.readInt calls.
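
For reference, a rough sketch of what that single for comprehension with the key and nval definitions desugars to (not the exact compiler output, but close enough to make the eager reads visible):

// The definitions that follow the first generator are folded into a map over
// that generator, so every readUTF (and the per-key count readInt) runs
// before any inner value is read.
val heads = (1 to size).map { i =>
  val key  = in.readUTF
  val nval = in.readInt
  (i, key, nval)
}
val tuples = heads.flatMap { case (_, key, nval) =>
  (1 to nval).map { _ =>
    val value = in.readInt
    key -> value
  }
}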

Hello爱情风 2024-12-10 09:22:38

Here is the compacted version of @huynhjl's answer that I eventually deployed:

def loadMap(in: ObjectInputStream): Map[String, IndexedSeq[Int]] =
  ((1 to in.readInt()) map { _ =>
    in.readUTF() -> ((1 to in.readInt()) map { _ => in.readInt() })
  })(collection.breakOut)

The advantage of this version is that there are no direct assignments.
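
As a quick sanity check, here is a hypothetical round trip through the MapSerializer from the question (assuming a pre-2.13 Scala, where mapValues is strict and collection.breakOut exists); the byte-array streams just keep the test self-contained:

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

val original = Map("a" -> Seq(1, 2, 3), "b" -> Seq(42))

// Write the map into an in-memory buffer...
val buffer = new ByteArrayOutputStream()
val out = new ObjectOutputStream(buffer)
MapSerializer.saveMap(out, original)
out.flush()

// ...then read it back and check that keys and values survive the round trip.
val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
val loaded = MapSerializer.loadMap(in)
assert(loaded("a") == IndexedSeq(1, 2, 3) && loaded("b") == IndexedSeq(42))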
