序列理解中的多重产量？

发布于 2024-09-10 01:24:16 字数 613 浏览 11 评论 0原文

我正在尝试学习 Scala 并尝试编写一个序列理解，从序列中提取一元组、二元组和三元组。例如，[1,2,3,4] 应该转换为（不是 Scala 语法）

[1; _,1; _,_,1; 2; 1,2; _,1,2; 3; 2,3; 1,2,3; 4; 3,4; 2,3,4]

在 Scala 2.8 中，我尝试了以下操作：

def trigrams(tokens : Seq[T]) = {
  var t1 : Option[T] = None
  var t2 : Option[T] = None
  for (t3 <- tokens) {
    yield t3
    yield (t2,t3)
    yield (t1,t2,Some(t3))
    t1 = t2
    t2 = t3
  }
}

但这不会编译为显然，在 for 理解中只允许一个 yield （也没有块语句）。是否有任何其他优雅的方式来获得相同的行为，只需一次传递数据？

原文

I'm trying to learn Scala and tried to write a sequence comprehension that extracts unigrams, bigrams and trigrams from a sequence. E.g., [1,2,3,4] should be transformed to (not Scala syntax)

[1; _,1; _,_,1; 2; 1,2; _,1,2; 3; 2,3; 1,2,3; 4; 3,4; 2,3,4]

In Scala 2.8, I tried the following:

def trigrams(tokens : Seq[T]) = {
  var t1 : Option[T] = None
  var t2 : Option[T] = None
  for (t3 <- tokens) {
    yield t3
    yield (t2,t3)
    yield (t1,t2,Some(t3))
    t1 = t2
    t2 = t3
  }
}

But this doesn't compile as, apparently, only one yield is allowed in a for-comprehension (no block statements either). Is there any other elegant way to get the same behavior, with only one pass over the data?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

要走干脆点 2024-09-17 01:24:16

for 循环中不能有多个 yield，因为 for 循环是 map（或 flatMap）操作的语法糖：

for (i <- collection) yield( func(i) )

转换为

collection map {i => func(i)}

完全没有 Yield

for (i <- collection) func(i)

转换为

collection foreach {i => func(i)}

因此 for 循环的整个主体变成单个闭包，并且 yield 关键字的存在决定了在集合上调用的函数是 map 还是 foreach （或者 <代码>flatMap）。由于这种翻译，以下行为是被禁止的：

在 yield 旁边使用命令式语句来确定将产生什么。
使用多个收益

（更不用说您提出的版本将返回一个 List[Any] 因为元组和 1-gram 都是不同的类型。您可能想要获得一个 List[ List[Int]] 代替）

请尝试以下操作（将 n 元语法按其出现的顺序排列）：

val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)

for {onegram <- basis
     ngram <- slidingIterators if ngram.hasNext}
     yield (ngram.next)

或者

val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
val first=slidingIterators head
val buf=new ListBuffer[List[Int]]

while (first.hasNext)
   for (i <- slidingIterators)
      if (i.hasNext)
         buf += i.next

如果您更喜欢 n 元语法按长度顺序，请尝试：

val basis = List(1,2,3,4)
1 to 4 flatMap { basis sliding _ toList }

You can't have multiple yields in a for loop because for loops are syntactic sugar for the map (or flatMap) operations:

for (i <- collection) yield( func(i) )

translates into

collection map {i => func(i)}

Without a yield at all

for (i <- collection) func(i)

translates into

collection foreach {i => func(i)}

So the entire body of the for loop is turned into a single closure, and the presence of the yield keyword determines whether the function called on the collection is map or foreach (or flatMap). Because of this translation, the following are forbidden:

Using imperative statements next to a yield to determine what will be yielded.
Using multiple yields

(Not to mention that your proposed verison will return a List[Any] because the tuples and the 1-gram are all of different types. You probably want to get a List[List[Int]] instead)

Try the following instead (which put the n-grams in the order they appear):

val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)

for {onegram <- basis
     ngram <- slidingIterators if ngram.hasNext}
     yield (ngram.next)

val basis = List(1,2,3,4)
val slidingIterators = 1 to 4 map (basis sliding _)
val first=slidingIterators head
val buf=new ListBuffer[List[Int]]

while (first.hasNext)
   for (i <- slidingIterators)
      if (i.hasNext)
         buf += i.next

If you prefer the n-grams to be in length order, try:

val basis = List(1,2,3,4)
1 to 4 flatMap { basis sliding _ toList }

回复收藏 0 原文

冷月断魂刀 2024-09-17 01:24:16

scala> val basis = List(1, 2, 3, 4)
basis: List[Int] = List(1, 2, 3, 4)

scala> val nGrams = (basis sliding 1).toList ::: (basis sliding 2).toList ::: (basis sliding 3).toList
nGrams: List[List[Int]] = ...

scala> nGrams foreach (println _)
List(1)
List(2)
List(3)
List(4)
List(1, 2)
List(2, 3)
List(3, 4)
List(1, 2, 3)
List(2, 3, 4)

scala> val basis = List(1, 2, 3, 4)
basis: List[Int] = List(1, 2, 3, 4)

scala> val nGrams = (basis sliding 1).toList ::: (basis sliding 2).toList ::: (basis sliding 3).toList
nGrams: List[List[Int]] = ...

scala> nGrams foreach (println _)
List(1)
List(2)
List(3)
List(4)
List(1, 2)
List(2, 3)
List(3, 4)
List(1, 2, 3)
List(2, 3, 4)

回复收藏 0 原文

奈何桥上唱咆哮 2024-09-17 01:24:16

我想我应该多考虑一下。

def trigrams(tokens : Seq[T]) : Seq[(Option[T],Option[T],T)] = {
  var t1 : Option[T] = None
  var t2 : Option[T] = None
  for (t3 <- tokens)
    yield {
      val tri = (t1,t2,t3)
      t1 = t2
      t2 = Some(t3)
      tri
    }
}

然后从三元组中提取一元组和二元组。但是谁能向我解释为什么不允许“多重收益”，以及是否有其他方法可以达到其效果？

I guess I should have given this more thought.

def trigrams(tokens : Seq[T]) : Seq[(Option[T],Option[T],T)] = {
  var t1 : Option[T] = None
  var t2 : Option[T] = None
  for (t3 <- tokens)
    yield {
      val tri = (t1,t2,t3)
      t1 = t2
      t2 = Some(t3)
      tri
    }
}

Then extract the unigrams and bigrams from the trigrams. But can anyone explain to me why 'multi-yields' are not permitted, and if there's any other way to achieve their effect?

回复收藏 0 原文

心意如水 2024-09-17 01:24:16

val basis = List(1, 2, 3, 4)
val nGrams = basis.map(x => (x)) ::: (for (a <- basis; b <- basis) yield (a, b)) ::: (for (a <- basis; b <- basis; c <- basis) yield (a, b, c))
nGrams: List[Any] = ...
nGrams foreach (println(_))
1
2
3
4
(1,1)
(1,2)
(1,3)
(1,4)
(2,1)
(2,2)
(2,3)
(2,4)
(3,1)
(3,2)
(3,3)
(3,4)
(4,1)
(4,2)
(4,3)
(4,4)
(1,1,1)
(1,1,2)
(1,1,3)
(1,1,4)
(1,2,1)
(1,2,2)
(1,2,3)
(1,2,4)
(1,3,1)
(1,3,2)
(1,3,3)
(1,3,4)
(1,4,1)
(1,4,2)
(1,4,3)
(1,4,4)
(2,1,1)
(2,1,2)
(2,1,3)
(2,1,4)
(2,2,1)
(2,2,2)
(2,2,3)
(2,2,4)
(2,3,1)
(2,3,2)
(2,3,3)
(2,3,4)
(2,4,1)
(2,4,2)
(2,4,3)
(2,4,4)
(3,1,1)
(3,1,2)
(3,1,3)
(3,1,4)
(3,2,1)
(3,2,2)
(3,2,3)
(3,2,4)
(3,3,1)
(3,3,2)
(3,3,3)
(3,3,4)
(3,4,1)
(3,4,2)
(3,4,3)
(3,4,4)
(4,1,1)
(4,1,2)
(4,1,3)
(4,1,4)
(4,2,1)
(4,2,2)
(4,2,3)
(4,2,4)
(4,3,1)
(4,3,2)
(4,3,3)
(4,3,4)
(4,4,1)
(4,4,2)
(4,4,3)
(4,4,4)

val basis = List(1, 2, 3, 4)
val nGrams = basis.map(x => (x)) ::: (for (a <- basis; b <- basis) yield (a, b)) ::: (for (a <- basis; b <- basis; c <- basis) yield (a, b, c))
nGrams: List[Any] = ...
nGrams foreach (println(_))
1
2
3
4
(1,1)
(1,2)
(1,3)
(1,4)
(2,1)
(2,2)
(2,3)
(2,4)
(3,1)
(3,2)
(3,3)
(3,4)
(4,1)
(4,2)
(4,3)
(4,4)
(1,1,1)
(1,1,2)
(1,1,3)
(1,1,4)
(1,2,1)
(1,2,2)
(1,2,3)
(1,2,4)
(1,3,1)
(1,3,2)
(1,3,3)
(1,3,4)
(1,4,1)
(1,4,2)
(1,4,3)
(1,4,4)
(2,1,1)
(2,1,2)
(2,1,3)
(2,1,4)
(2,2,1)
(2,2,2)
(2,2,3)
(2,2,4)
(2,3,1)
(2,3,2)
(2,3,3)
(2,3,4)
(2,4,1)
(2,4,2)
(2,4,3)
(2,4,4)
(3,1,1)
(3,1,2)
(3,1,3)
(3,1,4)
(3,2,1)
(3,2,2)
(3,2,3)
(3,2,4)
(3,3,1)
(3,3,2)
(3,3,3)
(3,3,4)
(3,4,1)
(3,4,2)
(3,4,3)
(3,4,4)
(4,1,1)
(4,1,2)
(4,1,3)
(4,1,4)
(4,2,1)
(4,2,2)
(4,2,3)
(4,2,4)
(4,3,1)
(4,3,2)
(4,3,3)
(4,3,4)
(4,4,1)
(4,4,2)
(4,4,3)
(4,4,4)

回复收藏 0 原文

笑着哭最痛 2024-09-17 01:24:16

您可以尝试没有分配的功能版本：

def trigrams[T](tokens : Seq[T]) = {
  val s1 = tokens.map { Some(_) }
  val s2 = None +: s1
  val s3 = None +: s2
  s1 zip s2 zip s3 map {
    case ((t1, t2), t3) => (List(t1), List(t1, t2), List(t1, t2, t3))
  }
}

You could try a functional version without assignments:

def trigrams[T](tokens : Seq[T]) = {
  val s1 = tokens.map { Some(_) }
  val s2 = None +: s1
  val s3 = None +: s2
  s1 zip s2 zip s3 map {
    case ((t1, t2), t3) => (List(t1), List(t1, t2), List(t1, t2, t3))
  }
}

回复收藏 0 原文

~没有更多了~