如何使用KDTree进行任意维度的top-k查询和范围查询
我使用了KD-tree(libkdtree++)来存储一个多维数据集,这里的要求是这个数据集可以支持不同维度上的top-k/range查询。例如,KDTree<3,Point>。树:查找具有最高 Point[1](y 轴)值的前 100 个点。
从libkdtree++的实现来看,类似的是“find_within_range”函数,但它是根据“曼哈顿距离”来计算的,这里等于max(x_dist, max(y_dist, z_dist))。如何在一维上使用范围查询?
I have used KD-tree(libkdtree++) to store a multi-dimensional data set, and the requirements here is this data set can support top-k/range queries on different dimensions. For example, a KDTree<3, Point> tree: to find the top 100 points whose have highest Point[1](y axis) values.
From the implementation of libkdtree++, what's similar is the "find_within_range" functions, however it is counted based on "Manhattan distance", which equals max(x_dist, max(y_dist, z_dist)) here. How can I just use range query on one dimension?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
看看代码,看起来你不能以简单的方式做到这一点,这很可笑。如果我是你,我会很想破解这个库或者编写我自己的 kd 树。我会在他们的邮件列表上询问以确定,但看起来你可能必须做这样的事情:
这是对 Y 的极其低效的二分搜索,其中 count(y <= Y) == 100我对图书馆不熟悉,但这是我粗略检查过的最好的图书馆。
Looking at the code, it looks like you can't do that in a straightforward way, ridiculously enough. If I were you I'd be tempted to either hack the library or write my own kd-tree. I'd ask on their mailing list to be sure, but it looks like you might have to do something like this:
This is a horribly inefficient binary search for the Y at which count(points with y <= Y) == 100. I'm not familiar with the library, but that's the best I've got on a cursory inspection.