帮助我了解 Clojure 中如何处理不变性和运行时间之间的冲突

发布于 2024-09-18 21:26:44 字数 1172 浏览 8 评论 0原文

Clojure 确实引起了我的兴趣,我开始学习它的教程: http://java.ociweb.com/mark/clojure/article.html

考虑“Set”下提到的这两行:

(def stooges (hash-set "Moe" "Larry" "Curly")) ; not sorted
(def more-stooges (conj stooges "Shemp")) ; -> #{"Moe" "Larry" "Curly" "Shemp"}

我的第一个想法是第二个操作应该花费恒定的时间来完成;否则,函数式语言可能比面向对象的语言没有什么好处。人们很容易想象需要从[几乎]空的集合开始,然后填充它并随着我们的进展缩小它。因此,我们可以将新结果重新分配给自己,而不是将新结果分配给 more-sooges。

现在,由于函数式语言的奇妙承诺,副作用不再需要担心。因此,集合 stoogesmore-stooges 不应该相互叠加工作。因此,要么创建 more-stooges 是一个线性操作,要么它们共享一个公共缓冲区(如 Java 的 StringBuffer),这看起来是一个非常糟糕的主意,并且与不变性(随后 stooges 可以逐一删除一个元素)。

我可能在这里重新发明了一个轮子。当您从最大数量的元素开始,然后一次删除它们直到空集时,hash-set 似乎在 clojure 中性能更高从一个空集开始,然后一次增加一个。

上面的例子可能看起来不太实用,或者有解决方法,但是像 Java/C#/Python/ 等面向对象的语言。一次增加或缩小一组元素或几个元素都没有问题,同时速度也很快。

保证(或只是承诺?)不变性的[功能]语言将无法快速增长集合。是否还有另一种习惯用法可以帮助避免这样做?

对于熟悉 Python 的人,我会提到集合理解与等效循环方法。两者的运行时间略有不同,但这与 C、Python 和解释器的相对速度有关,而不是源于复杂性。我看到的问题是,集合理解通常是一种更好的方法,但并不总是最好的方法,因为可读性可能会受到很大影响。

如果问题不清楚,请告诉我。

Clojure truly piqued my interest, and I started going through a tutorial on it:
http://java.ociweb.com/mark/clojure/article.html

Consider these two lines mentioned under "Set":

(def stooges (hash-set "Moe" "Larry" "Curly")) ; not sorted
(def more-stooges (conj stooges "Shemp")) ; -> #{"Moe" "Larry" "Curly" "Shemp"}

My first thought was that the second operation should take constant time to complete; otherwise functional language might have little benefit over an object-oriented one. One can easily imagine a need to start with [nearly] empty set, and populate it and shrink it as we go along. So, instead of assigning the new result to more-stooges, we could re-assign it to itself.

Now, by the marvelous promise of functional languages, side effects are not to be concerned with. So, sets stooges and more-stooges should not work on top of each other ever. So, either the creation of more-stooges is a linear operation, or they share a common buffer (like Java's StringBuffer) which would seem like a very bad idea and conflict with immutability (subsequently stooges can drop an element one-by-one).

I am probably reinventing a wheel here. it seems like the hash-set would be more performant in clojure when you start with the maximum number of elements and then remove them one at a time until empty set as oppose to starting with an empty set and growing it one at a time.

The examples above might not seem terribly practical, or have workarounds, but the object-oriented language like Java/C#/Python/etc. has no problem with either growing or shrinking a set one or few elements at a time while also doing it fast.

A [functional] language which guarantees(or just promises?) immutability would not be able to grow a set as fast. Is there another idiom that one can use which somehow can help avoiding doing that?

For someone familiar with Python, I would mention set comprehension versus an equivalent loop approach. The running time of the two is tiny bit different, but that has to do with relative speeds of C, Python, interpreter and not rooted in complexity. The problem I see is that set comprehension is often a better approach, but NOT ALWAYS the best approach, for the readability might suffer a great deal.

Let me know if the question is not clear.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

网白 2024-09-25 21:26:46

对我来说,核心的不可变数据结构也是该语言最迷人的部分之一。他们对于回答这个问题有很多内容,Rich 在这个视频中做得非常出色:

http://blip。 tv/file/707974

核心数据结构:

  • 实际上是完全不可变的
  • 旧副本也是不可变的
  • 性能不会因为旧副本而降低
  • 访问是常量(实际上有界 <= 常量)
  • 所有支持高效的追加、连接(列表和序列除外)和砍伐

他们是如何做到这一点的???

  • 秘密:引擎盖下几乎所有树(实际上是一个特里树)。

但如果我真的想就地编辑某些内容怎么办?

  • 您可以使用 clojure 的 transients 就地编辑结构,然后在您需要时生成不可变版本(在恒定时间内)准备分享它。

作为一个小背景: Trie 是一棵树,其中键的所有公共元素都被提升起来到树顶。 clojure 中的集合和映射使用 trie,其中索引是您要查找的键的哈希值。然后,它将散列分解成小块,并使用每个块作为散列特里树的一级的密钥。这允许共享新旧映射的公共部分,并且访问时间受到限制,因为输入中使用的散列具有固定大小,因此只能有固定数量的分支。

使用这些哈希尝试还有助于防止许多其他持久数据结构使用的重新平衡过程中出现大幅减速。所以你实际上会得到相当恒定的挂钟访问时间。

我真的推荐这本书(相对较短):纯函数式数据结构
在其中,他涵盖了许多非常有趣的结构和概念,例如“消除摊销”以允许队列的真正恒定时间访问。以及惰性持久队列之类的东西。作者甚至在 此处为 PDF

The core Immutable data structures are one of the most fascinating parts of the language for me also. their is a lot to answering this question and Rich does a really great job of it in this video:

http://blip.tv/file/707974

The core data structures:

  • are actually fully immutable
  • the old copies are also immutable
  • performance does not degrade for the old copies
  • access is constant (actually bounded <= a constant)
  • all support efficient appending, concatenating (except lists and seqs) and chopping

How do they do this???

  • the secret: it's pretty much all trees under the hood (actually a trie).

But what if i really want to edit somthing in place?

  • you can use clojure's transients to edit a structure in place and then produce a immutable version (in constant time) when you are ready to share it.

as a little background: a Trie is a tree where all the common elements of the key are hoisted up to the top of the tree. the sets and maps in clojure use trie where the indexes are a hash of the key you are looking for. it then breaks the hash up into small chunks and uses each chunk as the key to one level of the hash-trie. This allows the common parts of both the new and old maps to be shared and the access time is bounded because there can only be a fixed number of branches because the hash used as in input has a fixed size.

Using these hash tries also helps prevent big slowdowns during the re-balancing used by a lot of other persistent data structures. so you will actually get fairly constant wall-clock-access-time.

I really reccomend the (relativly short)_ book: Purely Functional Data Structures
In it he covers a lot of really interesting structures and concepts like "removing amortization" to allow true constant time access for queues. and things like lazy-persistent queues. the author even offers a free copy in pdf here

墨小沫ゞ 2024-09-25 21:26:46

Clojure 的数据结构是持久,这意味着它们是不可变的,但使用结构共享来支持有效的“修改”。请参阅 Clojure 文档中关于不可变数据结构的部分以获得更全面的解释。它特别指出

具体来说,这意味着无法使用完整副本创建新版本,因为这需要线性时间。不可避免地,持久集合是使用链接数据结构来实现的,以便新版本可以与先前版本共享结构。

这些 帖子,以及一些 Rich Hickey 的演讲,很好地概述了持久数据结构的实现。

Clojure's data structures are persistent, which means that they are immutable but use structural sharing to support efficient "modifications". See the section on immutable data structures in the Clojure docs for a more thorough explanation. In particular, it states

Specifically, this means that the new version can't be created using a full copy, since that would require linear time. Inevitably, persistent collections are implemented using linked data structures, so that the new versions can share structure with the prior version.

These posts, as well as some of Rich Hickey's talks, give a good overview of the implementation of persistent data structures.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文