求解最近邻的最佳性能关键算法

发布于 2024-08-09 17:19:37 字数 553 浏览 8 评论 0原文

我们有一个 x,y 对的列表。每对代表二维空间上的一个点。我想从这个列表中找到距离特定点 xq,yq 最近的点。对于这个问题,最好的性能关键算法是什么? Lisp of point 不会改变;这意味着我不需要执行插入和删除。我只想找到该集合中目标 xq,yq 点的最近邻居。

编辑1:谢谢大家!正如 Stephan202 猜对的那样,我想重复执行此操作;就像一个函数。列表不一定是排序的(事实上我不明白它是如何排序的?就像一个主键为 2 列 a 和 y 的表?如果有帮助,那么我会对它进行排序)。

我将根据列表构建一次数据结构,然后我将在函数中使用此生成的数据结构(如果此过程本身相关)。

谢谢雅各布;看起来 KD-Tree 数据结构是一个很好的候选答案(我也这么认为。当我得到一些相关结果时我会更新)。

编辑2:我发现,这个问题被命名为“最近邻居”!

编辑3:第一个标题是“搜索算法(用于空间查询和空间索引)(最近邻)”;我选择了一个新标题:“求解最近邻的最佳性能关键算法”。因为我不想对我的初始数据执行插入和删除操作,并且我只想从它们到新点(不会插入)最近的一个,所以我选择(当前)研究 KD 树。感谢大家!

We have a list of x,y pairs. Every pair represents a point on a 2D space. I want to find the closest point from this list, to a specific point xq,yq. What is the best performance-critical algorithm for this problem? Lisp of points is not going to change; which means I do not need to perform insertion and deletion. I want just to find the nearest neighbor of a target xq,yq point in this set.

Edit 1: Thanks to all! As Stephan202 has guessed correctly, I want to do this repeatedly; like a function. An the list is not necessarily sorted (In fact I do not understand how can it be sorted? Like a table with a primary key of 2 columns a and y? If that helps then I will sort it).

I will construct the data structure based on the list one time, and then I will use this generated data structure in the function (if this process itself is relevant).

Thank you Jacob; It seems that KD-Tree data structure is a good candidate for being the answer (And I feel it is. I will update when I get some relevant results).

Edit 2: I have found that, this problem is named "nearest neighbor"!

Edit 3: First title was "In Search of an Algorithm (for Spatial-Querying and Spatial-Indexing) (Nearest Neighbor)"; I have chose a new title: "Best Performance-Critical Algorithm for Solving Nearest Neighbor". Since I do not want to perform insertion and deletion operation on my initial data and I want just the nearest one from them to a new point (which is not going to be inserted), I chose to (currently) working on KD-Trees. Thanks to all!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

看海 2024-08-16 17:19:37

正如 Stephan202 所指出的,如果您计划查找多个点的最接近匹配,则应该使用一棵树。

我会推荐一个 KD 树,它的实现可以在 OpenCV 2.0 等多个包中轻松找到。或者您可以自己实现一个!

编辑:我在这里问了一个关于kd-tree实现的问题 - 可能有用的。

编辑: KD 树已广泛成功地用于 NN 搜索:) - 另外,如果您愿意接受近似匹配,您可以使用 近似最近邻的快速库 (FLANN)。 FLANN 实现存在于 OpenCV 2.0 中。

如果您不需要近似答案,您可以调整 FLANN 参数来搜索整个树。

As Stephan202 noted, if you plan to find the closest-match for more than one point, you should use a tree.

I would recommend a KD-tree, whose implementation can be easily found in several packages like OpenCV 2.0. Or you could implement one yourself!

EDIT: I had asked a question about kd-tree implementations here - might be useful.

EDIT: KD-trees have been widely used successfully for NN searches :) - Also, if you're willing to accept approximate matches, you can use Fast Library for Approximate Nearest Neigbor (FLANN). The FLANN implementation is present in OpenCV 2.0.

If you don't want approximate answers you can tweak the FLANN parameters to search the entire tree.

如何视而不见 2024-08-16 17:19:37

如果查询点 (xq, yq) 发生变化而列表没有变化,则需要计算 点列表的 Voronoi 图。这将为您提供一组多边形或“单元”(其中一些是无限的);每个多边形对应于原始列表中的一个点,称为该单元的“站点”。完全位于一个多边形内部的任何点都比原始列表上的其他站点更接近该多边形的站点。两个多边形之间边界上的任何点与每个站点的距离相等。

一旦您了解了这一步,您就需要一种简单的方法来确定您所在的多边形。这称为 点位置问题

关于此类事情的一本非常非常好的书是计算几何:算法和应用。他们详细讨论了 Voronoi 图计算和点定位的梯形板方法。

如果您不想自己编写代码,也不应该这样做,那么尝试获取像 这样的库CGAL 将为您完成大部分工作。这可能也适用于 KD 树答案,但我具体不知道。

If the query point (xq, yq) varies and the list doesn't, you need to calculate the Voronoi diagram of the list of points. This will give you a set of polygons or "cells" (some of which are infinite); each polygon corresponds to a point from the original list, called the "site" of that cell. Any point which lies entirely inside of one polygon is closer to the site of that polygon than it is to the other sites on the original list. Any point on a boundary between two polygons lies equally distant from each site.

Once you've gotten that far, you need an easy way to figure out which polygon you're in. This is known as the point location problem.

A really, really good book for this kind of thing is Computational Geometry: Algorithms and Applications. They discuss both the Voronoi diagram calculation and the trapezoidal slab method of point location in detail.

If you don't want to do the code yourself, and you shouldn't, then try to get a library like CGAL that will do most of the work for you. This probably applies to the KD-tree answer as well, but I don't specifically know.

是伱的 2024-08-16 17:19:37

您需要一个空间索引

如果你自己推出,你会比选择 R-Tree四叉树算法。

You need a spatial index.

If you roll your own, you can do a lot worse than picking the R-Tree or Quad-tree algorithms.

清醇 2024-08-16 17:19:37

我会选择四叉树。这是最简单的空间结构。在二维中,我通常会推荐四叉树而不是 kd 树,因为它更简单、更快。它的缺点是,如果维度数较高,则消耗更多内存,但如果维度数较多,则差异并不显着。

如果您的坐标是浮点类型,那么有一个很好的优化技巧:
在查询中,您首先必须找到包含所询问的最近点的点的叶节点。为此,您必须从根到叶遍历树 - 在每次迭代中决定要踩到哪个子节点。
将子节点的标识符/地址存储在 Node 结构中的 4 大小的数组中。将查询算法中的点坐标数字化。然后,您只需通过数字化点坐标的 2 个正确位对数组进行索引即可找到正确的子节点。数字化速度很快:用一个简单的 static_cast 来实现。

但首先实现四叉树而不进行优化,因为位操作很容易产生错误。即使没有这种优化,它仍然是最快的解决方案。

I would go with a quadtree. It is the most simple spatial structure. In 2 dimensions i would generally recommend quadtree instead of kd-tree, because it is simpler, faster. Its downside is more memory consumption if the number of dimensions is high, but in case of 2 dimensions the difference is not significant.

There is a nice optimization trick if your coordinates are floating point typed:
In a query you first will have to find the leaf-node that contains the point to which the most near point is asked. To do this you will have to go in the tree from the root to the leaf - in each iteration deciding which child-node to step on.
Store the identifiers/addresses of the child-nodes in a 4-sized array in the Node structure. Digitize the point coordinates in the query algorithm. Then you will be able to find the proper sub-node just by indexing the array by 2 proper bits of the digitized point coordinates. Digitizing is fast: implement it with a simple static_cast.

But first implement the quadtree without optimization because it is easy to make a bug with the bit-operations. Even without this optimization, it still will be the fastest solution.

时光是把杀猪刀 2024-08-16 17:19:37

使用距离公式迭代所有其他点,找到距 Q (xq,yq) 的最小距离。

但是,您没有提供足够的信息来获得性能关键的答案。

例如,如果 Q 是一个非常常见的点,您可能需要计算到 Q 的距离并将其与每个点一起存储。

第二个例子,如果你有大量的点,你可以将这些点组织成部分,并从包含 Q 的部分的同一部分和相邻部分中的点开始。

Iterate through every other point using the distance formula to find the minimum distance from Q (xq,yq).

However, you haven't given enough information for a performance-critical answer.

For example, if Q is a VERY common point, you might want to calculate the distance to Q and store it with each point.

Second example, if you have a huge number of points, you might organize the points into sections and start with points only in the same section and adjacent sections to the section containing Q.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文