
Mastering immutable data structures

Posted on 2024-12-19 05:06:08


I am learning Scala, and as a good student I try to obey all the rules I have found.

One rule is: IMMUTABILITY!!!

So I have tried to code everything with immutable data structures and vals, and sometimes this is really hard.

But today I thought to myself: the only important thing is that the object/class should have no mutable state. I am not forced to code all methods in an immutable style, because these methods don't affect each other.

My question: am I correct, or are there any problems/disadvantages I don't see?

EDIT:

Code example for aishwarya:

def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList
  val stateSequence = (0 to order).toList.padTo(sequence.length,order)
  val seqPos = sequence.zipWithIndex

  def probOfSymbAtPos(symb: T, pos: Int) : Double = {
    val state = states(stateSequence(pos))
    M.log(state( seqPos.map( _._1 ).slice(0, pos).takeRight(order), symb))
  }

  val probs = seqPos.map( i => probOfSymbAtPos(i._1,i._2) )

  probs.sum
}

Explanation: It is a method to calculate the log-likelihood of a homogeneous Markov model of variable order. The apply method of state takes all previous symbols and the coming symbol and returns the probability of that symbol following them.

As you may see, the whole method is just multiplying some probabilities (by summing their logs), which would be much easier using vars.


空城仅有旧梦在 2024-12-26 05:06:08

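The rule is not really immutability, but referential transparency. It is perfectly OK to use locally declared mutable variables and arrays, because none of the effects are observable to any other part of the overall program.

The principle of referential transparency (RT) is this:

An expression e is referentially transparent if, for all programs p, every occurrence of e in p can be replaced with the result of evaluating e without affecting the observable result of p.

Note that if e creates and mutates some local state, it does not violate RT, since nobody can observe this happening.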
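To illustrate the point about local state, here is a minimal sketch. The object and method names are made up for this example. Both functions give the same result for the same input, and the var in the first one never escapes the method, so a caller cannot tell them apart.

```scala
object LocalMutation {
  // Mutates a local var internally; nothing outside the method can observe it.
  def sumOfSquares(xs: List[Int]): Long = {
    var total = 0L
    for (x <- xs) total += x.toLong * x
    total
  }

  // The same computation with no var at all.
  def sumOfSquaresFold(xs: List[Int]): Long =
    xs.foldLeft(0L)((acc, x) => acc + x.toLong * x)
}
```

Any call such as LocalMutation.sumOfSquares(List(1, 2, 3)) can be replaced by its result without changing the program, which is all that RT asks for.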

That said, I very much doubt that your implementation would be any simpler with vars.


感悟人生的甜 2024-12-26 05:06:08


The case for functional programming is one of being concise in your code and bringing in a more mathematical approach. It can reduce the possibility of bugs and make your code smaller and more readable. As for being easier or not, it does require that you think about your problems differently. But once you get used to thinking in functional patterns, the functional style is likely to become easier than the more imperative style.

It is really hard to be perfectly functional and have zero mutable state, but very beneficial to have minimal mutable state. The thing to remember is that everything needs to be done in balance and not taken to the extreme. By reducing the amount of mutable state you make it harder to write code with unintended consequences. A common pattern is to have a mutable variable whose value is immutable. This way identity (the named variable) and value (an immutable object the variable can be assigned) are separate.

var acc: List[Int] = Nil
// lots of complex stuff that adds values
acc ::= 1
acc ::= 2
acc ::= 3
// loop over the current list
acc foreach { i => /* do stuff that mutates acc */ acc ::= i * 10 }
println( acc ) // List(10, 20, 30, 3, 2, 1), because ::= prepends

The foreach loops over the value acc had at the time the foreach started. Reassigning acc inside the loop does not affect the iteration. This is much safer than the typical iterators in Java, where the list can change mid-iteration.
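The contrast can be sketched as follows. The object and method names are invented for this illustration, and the second method uses java.util.ArrayList as one example of a mutable collection whose iterator fails fast when the list is changed underneath it.

```scala
import java.util.{ArrayList, ConcurrentModificationException}

object SnapshotDemo {
  // Iterate an immutable List while rebinding the var that held it.
  def immutableLoop(): (List[Int], List[Int]) = {
    var acc: List[Int] = List(3, 2, 1)
    val snapshot = acc                     // the value that foreach traverses
    snapshot.foreach(i => acc ::= i * 10)  // rebinding acc leaves snapshot untouched
    (snapshot, acc)
  }

  // Add to a java.util.ArrayList while iterating over it.
  // Returns true if the iterator detected the modification.
  def mutableLoopFails(): Boolean = {
    val list = new ArrayList[Int]()
    List(3, 2, 1).foreach(i => list.add(i))
    try {
      val it = list.iterator()
      while (it.hasNext) list.add(it.next() * 10)
      false
    } catch {
      case _: ConcurrentModificationException => true
    }
  }
}
```

In the first method the traversed list is still List(3, 2, 1) afterwards, while acc has grown. In the second, the iterator throws as soon as it notices the list was modified.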

There is also a concurrency concern. Immutable objects are useful because of the JSR-133 memory model specification, which guarantees that once a constructor has finished, the values assigned to an object's final fields are visible to every thread that can see the object (provided the reference does not escape during construction). Non-final fields are "mutable" and carry no such guarantee: without safe publication, another thread may see them before they are initialized.

Actors are the perfect place to put mutable state. Objects that represent data should be immutable. Take the following example.

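import scala.actors.Actor  // legacy actors library, deprecated since Scala 2.10

object MyActor extends Actor {
  var acc: List[Int] = Nil  // mutable reference, immutable value
  def act(): Unit = {
    loop {
      react {
        case i: Int => acc ::= i
        case "what is your current value" => reply( acc )
        case _ => // ignore all other messages
      }
    }
  }
}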

In this case we can send the value of acc (which is a List) and not worry about synchronization, because List is immutable: its contents cannot change after construction. Because of that immutability, we also know that no other actor can change the underlying data structure that was sent, and thus no other actor can change the mutable state of this actor.
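The same idea can be sketched without an actor library. This is a different technique from the actor above, not a port of it: the only mutable thing is a java.util.concurrent.atomic.AtomicReference, and every value it points to is an immutable List. The object name SharedAcc and its methods are made up for this example.

```scala
import java.util.concurrent.atomic.AtomicReference

object SharedAcc {
  // The reference is the only mutable thing; every List it holds is immutable.
  private val acc = new AtomicReference[List[Int]](Nil)

  // Prepend i, retrying if another thread replaced the list in the meantime.
  @annotation.tailrec
  def add(i: Int): Unit = {
    val old = acc.get()
    if (!acc.compareAndSet(old, i :: old)) add(i)
  }

  // Readers get an immutable snapshot they can traverse without locking.
  def current: List[Int] = acc.get()
}
```

Several threads can call SharedAcc.add concurrently, and any list returned by SharedAcc.current stays unchanged no matter what the writers do afterwards.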


彩虹直至黑白 2024-12-26 05:06:08


Since Apocalisp has already mentioned the stuff I was going to quote him on, I'll discuss the code. You say it is just multiplying stuff, but I don't see that: it refers to at least three important things defined outside the method, namely order, states and M.log. I can infer that order is an Int, and that states returns a function that takes a List[T] and a T and returns a Double.

There's also some weird stuff going on...

def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList

Apart from its length, sequence is only used to define seqPos, so why do that?

  val stateSequence = (0 to order).toList.padTo(sequence.length,order)
  val seqPos = sequence.zipWithIndex

  def probOfSymbAtPos(symb: T, pos: Int) : Double = {
    val state = states(stateSequence(pos))
    M.log(state( seqPos.map( _._1 ).slice(0, pos).takeRight(order), symb))

Actually, you could use sequence here instead of seqPos.map( _._1 ), since all that does is undo the zipWithIndex. Also, slice(0, pos) is just take(pos).

  }

  val probs = seqPos.map( i => probOfSymbAtPos(i._1,i._2) )

  probs.sum
}

Now, given the missing methods, it is difficult to say how this should really be written in functional style. Keeping the mystery methods would yield:

def logLikelihood(seq: Iterator[T]): Double = {
  import scala.collection.immutable.Queue
  // Accumulator: position, current model order, last symbols seen, running sum.
  case class State(index: Int, order: Int, slice: Queue[T], result: Double)

  seq.foldLeft(State(0, 0, Queue.empty, 0.0)) {
    case (State(index, ord, slice, result), symb) =>
      val state = states(ord)  // ord grows from 0 up to order, like stateSequence
      val partial = M.log(state(slice.toList, symb))
      val newSlice = slice enqueue symb
      State(index + 1,
            if (ord == order) ord else ord + 1,
            // keep at most `order` previous symbols in the window
            if (newSlice.size > order) newSlice.dequeue._2 else newSlice,
            result + partial)
  }.result
}

Though I suspect the states/M.log stuff could be made part of State as well. I notice other optimizations now that I have written it like this. The sliding window you are using reminds me, of course, of sliding:

// Each window holds `order` previous symbols plus the symbol being scored,
// so every full window uses the highest-order state.
seq.sliding(order + 1).map {
  slice => M.log(states(order)(slice.init, slice.last))
}.sum

That will only start at the order-th element, so some adaptation would be in order. Not too difficult, though. So let's rewrite it again:

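def logLikelihood(seq: Iterator[T]): Double = {
  val sequence = seq.toList
  // Growing prefixes cover the first `order` symbols, which have a shorter history.
  val prefixes = (1 to math.min(order, sequence.length)).map(n => sequence.take(n)).toList
  // Each full window holds `order` previous symbols plus the symbol being scored.
  val windows = if (sequence.length > order) sequence.sliding(order + 1).toList else Nil
  (prefixes ::: windows).zipWithIndex.map {
    // The state index is capped at order, as stateSequence was in the original.
    case (slice, index) => M.log(states(math.min(index, order))(slice.init, slice.last))
  }.sum
}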

I wish I could see M.log and states... I bet I could turn that map into a foldLeft and do away with these two methods. And I suspect the method returned by states could take the whole slice instead of two parameters.

Still... not bad, is it?
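For anyone who wants to run the variants side by side, here is a self-contained sketch. Everything the question does not show is a hypothetical stand-in: order is fixed at 2, states is a toy scoring function invented for this example, T is fixed to Char, and math.log replaces M.log. It is only meant to check that the styles agree, not to model a real Markov chain.

```scala
object LogLikelihoodSketch {
  // Hypothetical stand-ins for the members the question does not show.
  val order = 2

  // Toy model: the "probability" depends on the state index and on whether
  // the symbol repeats the last symbol of the context.
  def states(k: Int): (List[Char], Char) => Double =
    (context, symb) =>
      1.0 / (2 + k + (if (context.lastOption == Some(symb)) 1 else 0))

  // Transcription of the question's method.
  def original(seq: Iterator[Char]): Double = {
    val sequence = seq.toList
    sequence.zipWithIndex.map { case (symb, pos) =>
      val context = sequence.take(pos).takeRight(order)
      math.log(states(math.min(pos, order))(context, symb))
    }.sum
  }

  // Sliding-window version: prefixes for the first symbols, full windows after.
  def sliding(seq: Iterator[Char]): Double = {
    val sequence = seq.toList
    val prefixes =
      (1 to math.min(order, sequence.length)).map(n => sequence.take(n)).toList
    val windows =
      if (sequence.length > order) sequence.sliding(order + 1).toList else Nil
    (prefixes ::: windows).zipWithIndex.map { case (slice, index) =>
      math.log(states(math.min(index, order))(slice.init, slice.last))
    }.sum
  }

  // Local-var version: the mutation stays inside the method.
  def withVars(seq: Iterator[Char]): Double = {
    var context = List.empty[Char]  // up to `order` previous symbols, newest last
    var total = 0.0
    for (symb <- seq) {
      total += math.log(states(context.length)(context, symb))
      context = (context :+ symb).takeRight(order)
    }
    total
  }
}
```

Calling each method with "abracadabra".iterator should give the same value, which fits the first answer's point: a caller cannot observe whether the method uses vars internally.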
