找到经过大多数点的直线的最有效算法是什么?
问题:
在二维平面上给出 N 个点。同一条直线上最多有多少个点?
该问题有 O(N2) 解决方案:遍历每个点并找到与当前点具有相同 dx / dy
的点的数量。将 dx / dy 关系存储在哈希映射中以提高效率。
有没有比 O(N2) 更好的解决方案?
The problem:
N points are given on a 2-dimensional plane. What is the maximum number of points on the same straight line?
The problem has O(N2) solution: go through each point and find the number of points which have the same dx / dy
with relation to the current point. Store dx / dy
relations in a hash map for efficiency.
Is there a better solution to this problem than O(N2)?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
对于这个问题,可能没有比标准计算模型中的 O(n^2) 明显更好的解决方案。
找到三个共线点的问题简化为找到经过最多点的直线的问题,并且找到三个共线点是 3SUM 困难的,这意味着在小于 O(n^2) 的时间内解决它将是一个主要问题理论结果。
请参阅上一个问题 找到三个共线点。
供您参考(使用已知证明),假设我们要回答一个 3SUM 问题,例如在列表 X 中查找 x、y、z,使得 x + y + z = 0。如果我们有一个用于共线点问题的快速算法,我们可以使用该算法来解决 3SUM 问题,如下所示。
对于 X 中的每个 x,创建点 (x, x^3)(现在我们假设 X 的元素是不同的)。接下来,检查创建的点中是否存在三个共线点。
要查看此方法是否有效,请注意,如果 x + y + z = 0,则从 x 到 y 的直线的斜率为
(y^3 - x^3) / (y - x) = y^2 + yx + x ^2
,从 x 到 z 的直线的斜率为
(z^3 - x^3) / (z - x) = z^2 + zx + x^2 = (-(x + y))^2 - (x + y)x + x^2
= x^2 + 2xy + y^2 - x^2 - xy + x^2 = y^2 + yx + x^2
相反,如果从 x 到 y 的斜率等于从 x 到 z 的斜率,则
y^2 + yx + x^2 = z^2 + zx + x^2,
这意味着
(y - z) (x + y + z) = 0,
因此 y = z 或 z = -x - y 就足以证明减少是有效的。
如果 X 中存在重复项,则首先检查任何 x 和重复元素 y 是否为 x + 2y = 0(使用哈希的线性时间或使用排序的 O(n lg n) 时间),然后删除重复项,然后减少到共线寻点问题。
There is likely no solution to this problem that is significantly better than O(n^2) in a standard model of computation.
The problem of finding three collinear points reduces to the problem of finding the line that goes through the most points, and finding three collinear points is 3SUM-hard, meaning that solving it in less than O(n^2) time would be a major theoretical result.
See the previous question on finding three collinear points.
For your reference (using the known proof), suppose we want to answer a 3SUM problem such as finding x, y, z in list X such that x + y + z = 0. If we had a fast algorithm for the collinear point problem, we could use that algorithm to solve the 3SUM problem as follows.
For each x in X, create the point (x, x^3) (for now we assume the elements of X are distinct). Next, check whether there exists three collinear points from among the created points.
To see that this works, note that if x + y + z = 0 then the slope of the line from x to y is
(y^3 - x^3) / (y - x) = y^2 + yx + x^2
and the slope of the line from x to z is
(z^3 - x^3) / (z - x) = z^2 + zx + x^2 = (-(x + y))^2 - (x + y)x + x^2
= x^2 + 2xy + y^2 - x^2 - xy + x^2 = y^2 + yx + x^2
Conversely, if the slope from x to y equals the slope from x to z then
y^2 + yx + x^2 = z^2 + zx + x^2,
which implies that
(y - z) (x + y + z) = 0,
so either y = z or z = -x - y as suffices to prove that the reduction is valid.
If there are duplicates in X, you first check whether x + 2y = 0 for any x and duplicate element y (in linear time using hashing or O(n lg n) time using sorting), and then remove the duplicates before reducing to the collinear point-finding problem.
如果将问题限制为穿过原点的线,则可以将点转换为极坐标(角度、距原点的距离)并按角度对它们进行排序。所有具有相同角度的点都位于同一条线上。 O(n logn)
我认为在一般情况下没有更快的解决方案。
If you limit the problem to lines passing through the origin, you can convert the points to polar coordinates (angle, distance from origin) and sort them by angle. All points with the same angle lie on the same line. O(n logn)
I don't think there is a faster solution in the general case.
霍夫变换可以给你一个近似的解决方案。它是近似的,因为分箱技术在参数空间中的分辨率有限,因此最大分箱将为您提供一些有限的可能线范围。
The Hough Transform can give you an approximate solution. It is approximate because the binning technique has a limited resolution in parameter space, so the maximum bin will give you some limited range of possible lines.
又是一个带有伪代码的 O(n^2) 解决方案。想法是创建一个以行本身作为键的哈希表。直线由两点(直线与 x 轴相交的点和直线与 y 轴相交的点)之间的斜率定义。
解决方案假设 Java、C# 等语言,其中对象的 equals 方法和 hashcode 方法用于哈希函数。
创建一个包含 3 个字段的对象(调用 SlopeObject)
poix
// 将为 (Infinity, some y value) 或 (x value, 0 )poix
将是一个点 (x, y) 对。如果线穿过 x 轴,则poix
将(某个数字,0)。如果线平行于 x 轴,则 poix =(无穷大,某个数字),其中 y 值是线与 y 轴相交的位置。如果
Slope
和poix
相等,则重写 equals 方法,其中 2 个对象相等。哈希码被一个函数覆盖,该函数根据
Slope
和poix
值的组合提供哈希码。下面是一些伪代码Again an O(n^2) solution with pseudo code. Idea is create a hash table with line itself as the key. Line is defined by slope between the two points, point where line cuts x-axis and point where line cuts y-axis.
Solution assumes languages like Java, C# where equals method and hashcode methods of the object are used for hashing function.
Create an Object (call SlopeObject) with 3 fields
poix
// Will be (Infinity, some y value) or (x value, 0)poix
will be a point (x, y) pair. If line crosses x-axis thepoix
will (some number, 0). If line is parallel to x axis then poix = (Infinity, some number) where y value is where line crosses y axis.Override equals method where 2 objects are equal if
Slope
andpoix
are equal.Hashcode is overridden with a function which provides hashcode based on combination of values of
Slope
andpoix
. Some pseudo code below使用 p=(a,b) p*:y=a*x + b 的点线对偶变换移动到对偶平面。
现在使用线扫描算法在 NlogN 时间内找到所有交点。
(如果你有一个在另一个之上的点,只需将这些点旋转到某个小角度)。
双平面中的交点对应于底漆平面中的线。
Move to the dual plane using the point-line duality transform for p=(a,b) p*:y=a*x + b.
Now using a line sweep algorithm find all intersection points in NlogN time.
(If you have points which are one above the other just rotate the points to some small angle).
The intersection points corresponds in the dual plane to lines in the primer plane.
谁说由于 3SUM 对这个问题进行了简化,因此复杂度是 O(n^2)。请注意,3SUM 的复杂度低于此。
请检查 https://en.wikipedia.org/wiki/3SUM 并阅读
https://tmc.web.engr.illinois.edu/reduce3sum_sosa.pdf
Whoever said that since 3SUM have a reduction to this problem and thus the complexity is O(n^2). Please note that the complexity of 3SUM is less than that.
Please check https://en.wikipedia.org/wiki/3SUM and also read
https://tmc.web.engr.illinois.edu/reduce3sum_sosa.pdf
正如已经提到的,可能没有比 O(n^2) 更好的方法来解决这个问题的一般情况。但是,如果您假设大量点位于同一条线上(假设点集中的随机点位于具有最大点数的线上的概率为 p)并且不需要精确的算法,随机算法更有效。
请注意,在第一步中,如果您选择了位于点数最多的线上的 2 个点,您将获得最优解。假设n非常大(即我们可以将找到2个理想点的概率视为放回抽样),则发生这种情况的概率为p^2。因此,经过 k 次迭代后找到次优解的概率为 (1 - p^2)^k。
假设你可以容忍假阴性率rate = err。那么该算法的运行时间为 O(nk) = O(n * log(err) / log(1 - p^2))。如果 n 和 p 都足够大,则这比 O(n^2) 效率更高。 (即假设 n = 1,000,000 并且您知道至少有 10,000 个点位于同一条线上。那么 n^2 将需要 10^12 次操作的量级,而随机算法将需要 10^9 次操作的量级以获得小于 5*10^-5 的错误率。)
As already mentioned, there probably isn't a way to solve the general case of this problem better than O(n^2). However, if you assume a large number of points lie on the same line (say the probability that a random point in the set of points lie on the line with the maximum number of points is p) and don't need an exact algorithm, a randomized algorithm is more efficient.
Note that in the first step, if you picked 2 points which lies on the line with the maximum number of points, you'll get the optimal solution. Assuming n is very large (i.e. we can treat the probability of finding 2 desirable points as sampling with replacement), the probability of this happening is p^2. Therefore the probability of finding a suboptimal solution after k iterations is (1 - p^2)^k.
Suppose you can tolerate a false negative rate rate = err. Then this algorithm runs in O(nk) = O(n * log(err) / log(1 - p^2)). If both n and p are large enough, this is significantly more efficient than O(n^2). (i.e. Supposed n = 1,000,000 and you know there are at least 10,000 points that lie on the same line. Then n^2 would required on the magnitude of 10^12 operations, while randomized algorithm would require on the magnitude of 10^9 operations to get a error rate of less than 5*10^-5.)
$o(n^2)$ 算法不太可能存在,因为问题(甚至检查 R^2 中的 3 个点是否共线)是 3Sum-hard (http://en.wikipedia.org/wiki/3SUM)
It is unlikely for a $o(n^2)$ algorithm to exist, since the problem (of even checking if 3 points in R^2 are collinear) is 3Sum-hard (http://en.wikipedia.org/wiki/3SUM)
这不是一个比 O(n^2) 更好的解决方案,但您可以执行以下操作,
2. 将这组新的平移点平移到相对于新的 (0,0) 的角度。
3.保存每个角度的最大点数(MSN)。
4.选择最大存储数量(MSN),即可解决
This is not a solution better than O(n^2), but you can do the following,
2.Translate this new set of translated points to the angle with respect to the new (0,0).
3.Keep stored the maximum number (MSN) of points that are in each angle.
4.Choose the maximum stored number (MSN), and that will be the solution