Help me understand how the conflict between immutability and running time is handled in Clojure

Posted 2024-09-18 21:26:44

Clojure truly piqued my interest, and I started going through a tutorial on it:
http://java.ociweb.com/mark/clojure/article.html

Consider these two lines mentioned under "Set":

(def stooges (hash-set "Moe" "Larry" "Curly")) ; not sorted
(def more-stooges (conj stooges "Shemp")) ; -> #{"Moe" "Larry" "Curly" "Shemp"}

My first thought was that the second operation should take constant time to complete; otherwise a functional language might have little benefit over an object-oriented one. One can easily imagine a need to start with a [nearly] empty set, then populate it and shrink it as we go along. So, instead of assigning the new result to more-stooges, we could re-assign it to the original name.

Now, by the marvelous promise of functional languages, side effects are nothing to worry about. So the sets stooges and more-stooges should never interfere with each other. So either the creation of more-stooges is a linear operation, or the two share a common buffer (like Java's StringBuffer), which would seem like a very bad idea and would conflict with immutability (stooges could subsequently drop its elements one by one).

I am probably reinventing the wheel here. It seems like a hash-set would be more performant in Clojure when you start with the maximum number of elements and then remove them one at a time until the set is empty, as opposed to starting with an empty set and growing it one element at a time.

The examples above might not seem terribly practical, or might have workarounds, but object-oriented languages like Java/C#/Python/etc. have no problem growing or shrinking a set one or a few elements at a time, and doing it fast.

A [functional] language which guarantees (or just promises?) immutability would not be able to grow a set as fast. Is there another idiom one can use to avoid doing that?

For someone familiar with Python, I would mention set comprehension versus an equivalent loop approach. The running times of the two are a tiny bit different, but that has to do with the relative speeds of C, Python, and the interpreter, and is not rooted in complexity. The problem I see is that a set comprehension is often a better approach, but NOT ALWAYS the best approach, because readability might suffer a great deal.

Let me know if the question is not clear.

网白 2024-09-25 21:26:46

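The core immutable data structures are one of the most fascinating parts of the language for me too. There is a lot to answering this question, and Rich does a really excellent job of it in this video:

http://blip.tv/file/707974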

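The core data structures:

are actually fully immutable
the old copies are also immutable
performance does not degrade for the old copies
access is constant (actually bounded <= a constant)
all support efficient appending, concatenating (except lists and seqs) and chopping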
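
As a small illustration of the first three points, here is a sketch that reuses the names from the question. fewer-stooges is a name introduced here for illustration only; it is not in the original post.

```clojure
;; Sketch: conj and disj return new sets; the original is never modified.
(def stooges (hash-set "Moe" "Larry" "Curly"))
(def more-stooges (conj stooges "Shemp"))    ; "grown" version
(def fewer-stooges (disj stooges "Curly"))   ; "shrunk" version (hypothetical name)

;; The old value is still intact and fully usable.
(println (count stooges))                    ; 3
(println (contains? stooges "Shemp"))        ; false
(println (contains? more-stooges "Shemp"))   ; true
(println (contains? fewer-stooges "Curly"))  ; false
```

All three sets can be kept and used side by side; none of them changes when another is derived from it.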

How do they do this???

The secret: it's pretty much all trees under the hood (actually a trie).

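But what if I really want to edit something in place?

You can use Clojure's transients to edit a structure in place and then produce an immutable version (in constant time) when you are ready to share it.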
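
A minimal sketch of that idiom for growing a set, assuming Clojure 1.1 or later (where transients were introduced). build-set is a hypothetical helper name used only for this example.

```clojure
;; Sketch: grow a set in place through a transient, then freeze it.
(defn build-set
  "Adds every item of xs to an initially empty set via a transient."
  [xs]
  (persistent!                        ; constant-time switch back to immutable
    (reduce conj! (transient #{}) xs) ; conj! may mutate; always use its return value
    ))

(def big (build-set (range 100000)))
(println (count big))                 ; 100000

;; into uses transients internally when the target collection supports them,
;; so in everyday code this is usually all that is needed.
(def same (into #{} (range 100000)))
(println (= big same))                ; true
```

The transient must not be used after persistent! is called, and it should stay local to the function that creates it; only the persistent result is shared.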

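As a little background: a trie is a tree where all the common elements of the key are hoisted up to the top of the tree. The sets and maps in Clojure use a trie where the index is a hash of the key you are looking for. The hash is broken up into small chunks, and each chunk is used as the key for one level of the hash trie. This allows the common parts of both the new and old maps to be shared, and the access time is bounded because the hash used as input has a fixed size, so there can only be a fixed number of levels.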
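
To make the chunking concrete, here is a rough sketch of how a hash could be split into per-level indexes. It assumes 5-bit chunks (32-way branching), which matches my understanding of Clojure's hash maps and sets but is an implementation detail rather than a documented guarantee. hash-chunks is a hypothetical function written for this illustration, not part of clojure.core.

```clojure
;; Sketch: split a key's 32-bit hash into 5-bit chunks, lowest bits first.
;; Each chunk (0..31) would select one child at one level of the trie.
(defn hash-chunks
  [k]
  (let [h (bit-and (long (hash k)) 0xFFFFFFFF)]   ; view the hash as unsigned 32-bit
    (mapv (fn [shift]
            (bit-and (bit-shift-right h shift) 0x1f))
          (range 0 35 5))))                       ; shifts 0, 5, ..., 30

(println (hash-chunks "Moe"))   ; a vector of 7 small integers
```

Because a 32-bit hash yields at most 7 such chunks, a lookup walks at most 7 levels no matter how many elements the set holds, which is where the bounded access time comes from. Adding an element only copies the nodes along that one path; everything else is shared with the old version.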

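Using these hash tries also helps prevent the big slowdowns during rebalancing that affect a lot of other persistent data structures. So you will actually get fairly constant wall-clock access time.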

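I really recommend the (relatively short) book Purely Functional Data Structures.
In it, the author covers a lot of really interesting structures and concepts, such as "removing amortization" to allow true constant-time access for queues, and things like lazy persistent queues. The author even offers it as a PDF.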