计算希尔伯特 R 树中使用的点的希尔伯特值?
我有一个应用程序,其中希尔伯特 R 树 (wikipedia) (citeseer) 似乎是一个合适的数据结构。 具体来说,它需要对将经历大量更新的数据集进行相当快的空间查询。
然而,据我所知,该数据结构的算法描述都没有提及如何实际计算所需的希尔伯特值; 这是沿着希尔伯特曲线到该点的距离。
那么对于如何计算这个有什么建议吗?
I have an application where a Hilbert R-Tree (wikipedia) (citeseer) would seem to be an appropriate data structure. Specifically, it requires reasonably fast spatial queries over a data set that will experience a lot of updates.
However, as far as I can see, none of the descriptions of the algorithms for this data structure even mention how to actually calculate the requisite Hilbert Value; which is the distance along a Hilbert Curve to the point.
So any suggestions for how to go about calculating this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
建议:对于空间查询来说,一个好的简单高效的数据结构是多维二叉树。
在传统的二叉树中,有一个“判别式”; 用于确定是采用左分支还是右分支的值。 这可以被认为是一维情况。
在多维二叉树中,有多个判别式; 连续的级别使用不同的判别式。 例如,对于二维空间数据,您可以使用 X 和 Y 坐标作为判别式。 连续级别将使用 X、Y、X、Y...
对于空间查询(例如查找矩形内的所有节点),您从根开始对树进行深度优先搜索,并在每个级别使用判别式以避免向下搜索给定矩形中不包含节点的分支。
这使您可以将每个级别的搜索空间减少一半,从而非常有效地在海量数据集中查找小区域。 (顺便说一句,这种数据结构对于部分匹配查询也很有用,即省略一个或多个判别式的查询。您只需在省略判别式的级别上搜索两个分支。)
关于此数据结构的一篇很好的论文:http://portal.acm.org/itation.cfm?id=361007
这篇文章有很好的图表和算法描述:http://en.wikipedia.org/wiki/Kd -树
Suggestion: A good simple efficient data structure for spatial queries is a multidimensional binary tree.
In a traditional binary tree, there is one "discriminant"; the value that's used to determine whether you take the left branch or the right branch. This can be considered to be the one-dimensional case.
In a multidimensional binary tree, you have multiple discriminants; consecutive levels use different discriminants. For example, for two dimensional spacial data, you could use the X and Y coordinates as discriminants. Consecutive levels would use X, Y, X, Y...
For spatial queries (for example finding all nodes within a rectangle) you do a depth-first search of the tree starting at the root, and you use the discriminant at each level to avoid searching down branches that contain no nodes in the given rectangle.
This allows you to potentially cut the search space in half at each level, making it very efficient for finding small regions in a massive data set. (BTW, this data structure is also useful for partial-match queries, i.e. queries that omit one or more discriminants. You just search down both branches at levels with an omitted discriminant.)
A good paper on this data structure: http://portal.acm.org/citation.cfm?id=361007
This article has good diagrams and algorithm descriptions: http://en.wikipedia.org/wiki/Kd-tree
如果您需要具有快速删除/插入功能的空间索引,请查看 PH 树。 它部分基于四叉树,但速度更快,空间效率更高。 它在内部使用 Z 曲线,其空间属性比 H 曲线稍差,但更容易计算。
论文:http://www.globis.ethz.ch/script/publication /download?docid=699
Java 实现:http://globis.ethz.ch/files/2014/11/ph-tree-2014-11-10.zip
另一种选择是 X-tree,也可在此处使用:
https://code.google.com/p/xxl/
If you need a spatial index with fast delete/insert capabilities, have a look at the PH-tree. It partly based on quadtrees but faster and more space efficient. Internally it uses a Z-curve which has slightly worse spatial properties than an H-curve but is much easier to calculate.
Paper: http://www.globis.ethz.ch/script/publication/download?docid=699
Java implementation: http://globis.ethz.ch/files/2014/11/ph-tree-2014-11-10.zip
Another option is the X-tree, which is also available here:
https://code.google.com/p/xxl/
Michael,
感谢您的 Java 代码! 我测试了它,它似乎工作正常,但我注意到位交错函数在递归级别 7 溢出(至少在我的测试中,但我使用了长值),因为“n”值是使用最高的OneBit( )-函数,返回值而不是最高一位的位置; 因此循环会进行不必要的多次交错。
我只是将其更改为以下代码片段,之后效果很好。
Michael,
thanks for your Java code! I tested it and it seems to work fine, but I noticed that the bit-interleaving function overflows at recursion level 7 (at least in my tests, but I used long values), because the "n"-value is calculated using highestOneBit()-function, which returns the value and not the position of the highest one bit; so the loop does unnecessarily many interleavings.
I just changed it to the following snippet, and after that it worked fine.
上面的代码和 java 代码适用于 2D 数据点。 但对于更高维度,您可能需要查看 Jonathan Lawder 的论文:JKLawder。 使用希尔伯特空间填充曲线计算一维和 n 维值之间的映射。
The code and java code above are fine for 2D data points. But for higher dimensions you may need to look at Jonathan Lawder's paper: J.K.Lawder. Calculation of Mappings Between One and n-dimensional Values Using the Hilbert Space-filling Curve.
我找到了一种稍微更有效的交错位的方法。 可以在 斯坦福图形网站。 我包含了我创建的一个版本,它可以将两个 32 位整数交错为一个 64 位长。
显然,B 和 S 局部变量应该是类常量,但为了简单起见,保留了这种方式。
I figured out a slightly more efficient way to interleave bits. It can be found at the Stanford Graphics Website. I included a version that I created that can interleave two 32 bit integers into one 64 bit long.
Obviously, the B and S local variables should be class constants but it was left this way for simplicity.
请参阅 uzaygezen。
See uzaygezen.
有趣的问题!
我做了一些谷歌搜索,好消息是,我找到了希尔伯特值的实现。
潜在的坏消息是,它是在 Haskell 中...
http://www.serpentine.com/blog/2007/01/11/two-Dimension-spatial-hashing-with-space-filling-curves/
它还提出了您可能可以更轻松地计算勒贝格距离度量。
Fun question!
I did a bit of googling, and the good news is, I've found an implementation of Hilbert Value.
The potentially bad news is, it's in Haskell...
http://www.serpentine.com/blog/2007/01/11/two-dimensional-spatial-hashing-with-space-filling-curves/
It also proposes a Lebesgue distance metric you might be able to compute more easily.
下面是我的java代码,改编自Xian Lu和Gunther Schrack发表在Software: Practice and Experience Vol. 11上的论文“Encoding anddecode the Hilbert order”中的C代码。 26 页 1335-46 (1996)。
希望这可以帮助。 欢迎改进!
迈克尔
Below is my java code adapted from C code in the paper "Encoding and decoding the Hilbert order" by Xian Lu and Gunther Schrack, published in Software: Practice and Experience Vol. 26 pp 1335-46 (1996).
Hope this helps. Improvements welcome !
Michael