Why is a bitmapped vector trie faster than a plain vector?
It's supposedly faster than a vector, but I don't really understand how locality of reference is supposed to help this (since a vector is by definition the most locally packed data possible -- every element is packed next to the succeeding element, with no extra space between).
Is the benchmark assuming a specific usage pattern or something similar?
How is this possible?
6 Answers
Bitmapped vector tries aren't strictly faster than normal vectors, at least not at everything. It depends on what operation you are considering.
Conventional vectors are faster, for example, at accessing a data element at a specific index. It's hard to beat a straight indexed array lookup. And from a cache-locality perspective, big arrays are pretty good if all you are doing is looping over them sequentially.
However, a bitmapped vector trie will be much faster for other operations (thanks to structural sharing) - for example, creating a new copy with a single changed element, without affecting the original data structure, is O(log32 n) vs. O(n) for a traditional vector. That's a huge win.
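The O(log32 n) "new copy with a single changed element" works by path copying: only the nodes on the root-to-leaf path for the changed index are copied, and every other subtree is shared with the old version. Here is a minimal, hypothetical sketch in Python (a toy 32-way trie, not Clojure's or Scala's actual implementation; the names `assoc`, `BITS`, `MASK` are illustrative):

```python
BITS = 5                     # 2**5 = 32 children per node
MASK = (1 << BITS) - 1

def assoc(node, depth, index, value):
    """Return a new trie with `index` set to `value`, copying only the
    nodes on the root-to-leaf path and sharing everything else."""
    new_node = list(node)    # copy just this one node (at most 32 slots)
    slot = (index >> (depth * BITS)) & MASK
    if depth == 0:
        new_node[slot] = value
    else:
        new_node[slot] = assoc(node[slot], depth - 1, index, value)
    return new_node

# A depth-1 trie holding 1024 elements: 32 leaf arrays of 32 ints each.
v1 = [list(range(i * 32, (i + 1) * 32)) for i in range(32)]
v2 = assoc(v1, 1, 1000, -1)  # copies 2 small nodes, not 1024 slots
```

After the update, `v1` still sees the old value, and `v2` shares 31 of the 32 leaves with `v1` by identity; an immutable flat vector would have had to copy all 1024 elements to achieve the same thing.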
Here's an excellent video well worth watching on the topic, which includes a lot of the motivation for why you might want these kinds of structures in your language: Persistent Data Structures and Managed References (a talk by Rich Hickey).
There is a lot of good stuff in the other answers, but nobody answers your question. PersistentVectors are only fast for lots of random lookups by index (when the array is big). "How can that be?" you might ask. "A normal flat array only needs to move a pointer; the PersistentVector has to go through multiple steps."
The answer is "cache locality".
The cache always fetches a range from memory. If you have a big array, it does not fit in the cache. So if you want to get item x and then a far-away item y, you have to reload the cache each time, because the array is laid out as one contiguous block in memory.
Now with the PVector that's different. There are lots of small arrays floating around, and the JVM is smart about that and puts them close to each other in memory. So for random accesses this is fast; if you run through it sequentially it's much slower.
I have to say that I'm not an expert on hardware or on how the JVM handles cache locality, and I have never benchmarked this myself; I am just retelling things I've heard from other people :)
Edit: mikera mentions that too.
Edit 2: See this talk about functional data structures, and skip to the last part if you are only interested in the vector: http://www.infoq.com/presentations/Functional-Data-Structures-in-Scala
A bitmapped vector trie (aka a persistent vector) is a data structure invented by Rich Hickey for Clojure, which has been implemented in Scala since 2010 (v 2.8). It is its clever bitwise indexing strategy that allows for highly efficient access and modification of large data sets.
From Understanding Clojure's Persistent Vectors:
More info
Besides Understanding Clojure's Persistent Vectors, the ideas behind this data structure and its use cases are also explained pretty well in David Nolen's 2014 lecture Immutability, interactivity & JavaScript. Or if you really want to dive deep into the technical details, see also Phil Bagwell's Ideal Hash Trees, the paper upon which Hickey's initial Clojure implementation was based.
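The "clever bitwise indexing strategy" mentioned above can be sketched as follows: an index is split into 5-bit chunks, and each chunk selects one of 32 children at the corresponding trie level. This is a hypothetical toy version in Python, not the real Clojure or Scala code:

```python
BITS = 5                  # each level consumes 5 bits -> 32-way branching
MASK = (1 << BITS) - 1

def lookup(root, depth, index):
    """Walk one trie level per 5-bit chunk of the index, top chunk first."""
    node = root
    for level in range(depth, 0, -1):
        node = node[(index >> (level * BITS)) & MASK]
    return node[index & MASK]  # lowest 5 bits pick the slot in the leaf

# A depth-1 trie: 32 leaf arrays of 32 elements each = 1024 elements.
trie = [list(range(i * 32, (i + 1) * 32)) for i in range(32)]
```

For a vector of n elements the walk takes log32(n) steps, which is at most 7 for any index that fits in a 32-bit integer, so lookups are effectively constant time in practice, even though they are not a single pointer offset like a flat array.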
What do you mean by "plain vector"? Just a flat array of items? That's great if you never update it, but if you ever change a 1M-element flat vector you have to do a lot of copying; the tree exists to allow you to share most of the structure.
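To illustrate the copying cost this comment refers to: with an immutable flat vector, "changing" one element means rebuilding all n elements, whereas a trie copies only one short root-to-leaf path. A minimal Python illustration of the flat case, with a tuple standing in for an immutable flat vector:

```python
# An immutable flat "vector" of one million elements.
flat = tuple(range(1_000_000))

# Changing element 500 forces a full O(n) rebuild: every one of the
# 1,000,000 slots is copied into the new tuple.
updated = flat[:500] + (-1,) + flat[501:]
```

A structure-sharing trie would instead copy roughly log32(n) nodes of 32 slots each, which is about 4 small nodes for a million elements.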
Short explanation: it exploits the fact that the JVM optimizes reads, writes, and copies of array data structures so heavily. The key aspect, IMO, is that if your vector grows to a certain size, index management becomes a bottleneck. This is where the very clever algorithm from the persistent vector comes into play; on very large collections it outperforms the standard variant. So basically it is a functional data structure that performs so well because it is built up from small, mutable, highly optimized JVM data structures.
For further details, see here (at the end):
http://topsy.com/vimeo.com/28760673
Judging by the title of the talk, it's talking about Scala vectors, which aren't even close to "the most locally packed data possible": see source at https://lampsvn.epfl.ch/trac/scala/browser/scala/tags/R_2_9_1_final/src/library/scala/collection/immutable/Vector.scala.
Your definition only applies to Lisps (as far as I know).