给定两个(大)点集,我如何有效地找到彼此最接近的点对?

发布于 2024-10-18 16:27:17 字数 243 浏览 3 评论 0原文

我需要解决一个计算问题,该问题归结为搜索两个集合之间最接近的点对。问题是这样的:

给定欧几里德空间中的一组点 A 和一组点 B,找到所有对 (a,b),使得 b 是 B 中与 a 最近的点,a 是 A 中最近的点至 b.

集合 A 和 B 的大小大致相等,我们将这个大小称为 N。对于我的特定问题,N 大约为 250,000。

蛮力解决方案是将每个点与其他每个点进行比较,其时间复杂度为二次。有没有更有效的算法来做到这一点?

I need to solve a computational problem that boils down to searching for reciprocally-nearest pairs of points between two sets. The problem goes something like this:

Given a set of points A and a set of points B in euclidean space, find all pairs (a,b) such that b is the closest point in B to a and a is the closest point in A to b.

The sets A and B are of approximately equal size, and we will call this size N. For my particular problem N is approximately 250,000.

The brute force solution is to compare every point against every other point, which has quadratic time complexity. Is there any more efficient algorithm for doing this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

花伊自在美 2024-10-25 16:27:17

当我必须进行最近邻搜索时,我发现一个非常有用的数据结构是k-d-tree。 Wikipedia 有很好的概述,是一个很好的深入讨论如果您正在实现自己的算法(尽管很可能已经存在一个库 - 您没有提及您正在使用哪种语言)。 kd 树最重要的一点是它允许在 O(log N) 时间内执行最近邻搜索。

这样,您可以在 O(N log N) 时间内生成两个列表:A 的成员及其在 B 中的最近邻居以及 B 的成员及其在 A 中的最近邻居。然后,您可以比较列表以查看哪些对匹配。如果天真地这样做,那就是 O(N^2),尽管您可能可以想出一种更快的方法。

[编辑]你让我思考;这是我的第二个想法:

for(a in A)
    b := nearest(B, a)
    if a = nearest(A, b)
        add (a, b) to results
    end if
end for

function nearest(X, y)
    return nearest member of set X to point y
end function

根据我的计算,这是 O(N log N)。

A data structure I found very useful when I had to do nearest neighbour searches was a k-d-tree. Wikipedia has a nice overview and this is an excellent in-depth discussion of the algorithm if you're implementing your own (although a library may well exist already - you don't mention which language you're using). The most important thing about a kd-tree is that it allows nearest-neghbour searches to be performed in O(log N) time.

In that way, you could produce two lists of - the members of A and their nearest neighbour in B and the members of B and their nearest neighbour in A - in O(N log N) time. Then, you could compare the lists to see which pairs match. Done naively, that's O(N^2), though you might be able to think of a way to do it faster.

[edit] You've got me thinking; here's my second thought:

for(a in A)
    b := nearest(B, a)
    if a = nearest(A, b)
        add (a, b) to results
    end if
end for

function nearest(X, y)
    return nearest member of set X to point y
end function

By my reckoning, that's O(N log N).

放赐 2024-10-25 16:27:17

抱歉,我选择了一个相当旧的线程,但我只是想添加一个我在算法设计课教科书中找到的解决方案:

有一种分而治之(想想合并排序)的方法来解决这个问题,应该是O(n logn),我只看到它用于查找一组点内的最短距离,但它应该很容易适应要求每个配对由来自不同集合的点组成。

  1. 根据 X 值对所有点进行排序。
  2. 将整个集合分成两等份。
  3. 在每一半上递归并选择两者的最小距离 (d)
  4. 找到左半部分最右边的点 (p) 并检查之间所有点的距离p_xp_x + d,如果其中任何一个距离短于 d(即要返回的 d),否则返回d

Sorry for picking up a rather old thread but I just wanted to add a solution I've found in my textbook for an Algorithm Design class:

There is a divide-and-conquer (think merge-sort) approach to solve this problem that should be O(n logn), I've only seen it for finding the shortest distance within one set of points but it should be easily adapted to require each pairing to consist of points from different sets.

  1. Sort all points according to X-value.
  2. Split the whole set in two equal parts.
  3. Recurse on each half and pick the minimal distance of the two (d)
  4. Find the right-most point (p) in the left half and check the distance for all points between p_x and p_x + d, if any of these distances are shorter than d that is the d to return, otherwise return d.
半世蒼涼 2024-10-25 16:27:17
我不是你的备胎 2024-10-25 16:27:17

老线程,但我看到有一个最近的评论。

我相信对于 n 维点集,可以通过查找到集合差值原点的近点来找到两个集合之间的近点。您可以查找贝尔实验室的 Phillip Wolfe 撰写的论文,他在其中阐述了该算法。你可以这样想:在集合 A 中取一个随机点,在集合 B 中找到最近的点,然后在集合 B 中找到距离该点最近的点,依此类推。 http://link.springer.com/article/10.1007%2FBF01580381

Old thread, but I see there is a pretty recent comment.

I believe for an n dimensional set of points the near point between two sets can be found by finding the near point to the origin of the set difference. You can seek out the paper by Phillip Wolfe of Bell Labs where he lays out the algorithm. You can think of it by taking a random point in set A, finding the closest point in set B, then finding the closest point to the point in set B and so on. http://link.springer.com/article/10.1007%2FBF01580381

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文