kd 树对于 kNN 搜索是否有效? k 最近邻搜索
我必须在 kd 树中实现 10 维数据的 k 最近邻搜索。
但问题是,我的算法对于 k=1 非常快,但对于 k>1 (k=2,5,10,20,100) 慢 2000 倍,
这对于 kd 树来说是正常的,还是我在做一些磨损的事情?
I have to implement k nearest neighbors search for 10 dimensional data in kd-tree.
But problem is that my algorithm is very fast for k=1, but as much as 2000x slower for k>1 (k=2,5,10,20,100)
Is this normal for kd trees, or am I doing something worng?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
嗯,这主要取决于您的特定实现和数据集。
一棵不平衡的树意味着你必须搜索比你需要的更多的数据。确保你的树结构是健全的。
它还可能取决于您如何找到 k 个邻居。如果您的算法在树中搜索最近的邻居并存储它,然后搜索第二个最近的邻居并存储它等等,那么您的搜索效率不高。相反,当您发现遍历树的更近的点时,请保留 k 个最近邻居和碰撞点的运行列表。这样您就可以搜索一次,而不是搜索 k 次。
不管怎样,听起来你是为了一门课程而这样做。尝试与您的教授、助教或同学交谈,看看您的结果是否具有典型性。
Well, it primarily depends on your particular implementation and data set.
A poorly balanced tree will mean you have to search way more data than you need to. Make sure that your tree construction is sane.
It could also depend on how you find the k neighbors. If your algorithm searches the tree for the closest neighbor and stores it, then searches for the second closest and stores it etc, then you're not doing the search very efficiently. Instead keep a running list of the k nearest neighbors and bump points out of the list as you find closer ones traversing the tree. This way you search once, instead of k times.
Either way, it sounds like you're doing this for a course. Try talking to your professor, TAs, or classmates to see if your results are typical.
我知道这个问题已经得到解答,但有关使用 kd 树进行 KNN 搜索的更多详细信息,请参阅 Bentley (1975:514),Communications of the ACM 18(9),9 月。
I know this question has been answered, but for more detail on KNN searches with k-d trees, see Bentley (1975:514), Communications of the ACM 18(9), September.