寻找一种算法(二维二分查找的版本)
简单问题和已知算法:
我有一个包含 100 个成员的大数组。第一个 X 成员是 0,其余的是 1。找到 X。
我通过二分搜索来解决它:检查成员 50,如果它是 0 - 检查成员 75,依此类推,直到找到相邻的 0 和 1。
我正在寻找针对二维相同问题的优化算法:
我有二维数组 100*100。第 0-X 行和 0-Y 列上的成员为 0,其余为 1。如何找到 Y 和 X?
Easy problem and known algorithm:
I have a big array with 100 members. First X members are 0, and the rest are 1. Find X.
I am solving it by a binary search: Check member 50, if it is 0 - check member 75, etc, until I find adjacent 0 and 1.
I am looking for an optimized algorithm for the same problem in 2-dimensions:
I have 2-dimensional array 100*100. Those members that are on rows 0-X AND on columns 0-Y are 0, and the rest are 1. How to find Y and X?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
编辑:最佳解决方案包括两个简单的二分搜索。
我对下面的冗长而复杂的帖子感到非常抱歉。问题的本质是在包含 100*100 个元素的空间中找到一个点。你能做的最好的事情就是在每一步将这个空间一分为二。你可以用一种复杂的方式来做(我在这篇文章的其余部分中所做的)但是如果你意识到 X 轴上的二分搜索仍然在每一步将研究空间一分为二(Y 轴也是如此)轴)然后你就会明白它是最佳的。
我还是让我做的事,很抱歉我在里面做出了一些强制性的肯定。
如果您正在寻找一个简单的算法(尽管不是最佳的),只需按照建议运行两次二分搜索即可。
但是,如果您想要一个最佳的算法,您可以在 X 和 Y 上查找边界同时。 (需要注意的是,两种算法具有相同的渐近复杂度,但最优算法仍然会更快)
在下面的所有图形中,点(0, 0)都在左下角。
基本上,当您选择一个点并获得结果时,您将空间分成两部分。当你仔细想想,这实际上是你可以从中提取的最大信息量。
如果您选择该点(黑色十字)并且结果为 1(红线),则此表示您要查找的点不能位于灰色空间中(因此必须位于剩余的白色区域中)
另一方面,如果该值为 0(蓝线),则意味着您要查找的点不位于灰色区域(因此必须在剩余的白色区域中)
所以,如果你得到一个 0 结果和一个 1结果,这就是你会得到的:
您要查找的点位于矩形 1、2 或 3 中。您只需检查矩形 3 的两个角即可知道这 3 个矩形中哪一个是好的。
所以算法如下:
注意您正在使用的矩形的左下角和右上角在哪里。
沿着矩形的对角线进行二分搜索,直到您至少偶然发现一次结果为 1 和一次结果为 0。
检查矩形 3 的另外 2 个角(您必须已经知道对角线上两个角的值)可以仅检查一个角来知道正确的矩形(但您必须检查如果右边的矩形是矩形 3,则检查两个角)
确定您要查找的点是否在矩形 1、2 或 3 内
通过将问题减少到好的矩形来重复,直到最终的矩形减少为一个点:这就是您要查找的值
编辑:如果您想要最高最优性,那么当您选择点 (50, 50) 时,您不会将空间切成相等的部分。一个比另一个大三倍。理想情况下,您将选择一个将空间切割成两个相等区域(按面积)的点。
您应该在开始时计算一次
factor = (1.0 - 1.0/sqrt(2.0)) 的值
。然后,当您想要在a
和b
之间进行剪切时,请选择剪切点为a + Factor*(ba)
。当您在点 (100*factor, 100*factor) 处切割初始 100x100 矩形时,两个区域的面积将为 (100*100)/2,因此收敛速度会更快。Edit : The optimal solution consists in two simple binary search.
I'm very sorry for the long and convoluted post I did below. What the problem fundamentally consists in is to find a point in a space that contains 100*100 elements. The best you can do is to divide at each step this space in two. You can do it in a convoluted way (the one I did in the rest of the post) But if you realize that a binary search on the X axis still divides the research space in two at each step, (the same goes for the Y axis) then you understand that it's optimal.
I still let the thing I did, and I'm sorry that I made some peremptory affirmations in it.
If you're looking for a simple algorithm (though not optimal) just run the binary search twice as suggested.
However, if you want an optimal algorithm, you can look for the boundary on X and on Y at the same time. (You have to note that the two algorithm have same asymptotical complexity, but the optimal algorithm will still be faster)
In all the following graphics, the point (0, 0) is in the bottom left corner.
Basically when you choose a point and get the result, you cut your space in two parts. When you think about it that is actually the biggest amount of information you can extract from this.
If you choose the point (the black cross) and the result is 1 (red lines), this means that the point you're looking for can not be in the gray space (thus must be in the remaining white area)
On the other hand, if the value is 0 (blue lines), this means that the point you're looking for can not be in the gray area (thus must be in the remaining white area)
So, if you get one 0 result and one 1 result, this is what you'll get :
The point you're looking for is either in rectangle 1, 2 or 3. You just need to check the two corners of rectangle 3 to know which of the 3 rectangle is the good one.
So the algorithm is the following :
Note where are the bottom left and top right corner of the rectangle you're working with.
Do a binary search along the diagonal of the rectangle until you've stumbled at least once on a 1 result and once a 0 result.
Check the 2 other corners of the rectangle 3 (you'll necessary already know the values of the two corners on the diagonal) It is possible to check only one corner to know the right rectangle (but you'll have to check the two corners if the right rectangle is the rectangle 3)
Determine if the point you're looking for is in rectangle 1, 2 or 3
Repeat by reducing the problem to the good rectangle until the final rectangle is reduced to a point : it's the value you're looking for
Edit : if you want the supremum optimality, you'd not the when you choose the point (50, 50), you do not cut the space in equal part. One is three time bigger than the other. Ideally, you'll choose a point that cuts the space in two equal regions (area-wise)
You should compute once at the beginning the value of
factor = (1.0 - 1.0/sqrt(2.0))
. Then when you want to cut bewteen valuesa
andb
, choose the cutting point asa + factor*(b-a)
. When you cut the initial 100x100 rectangle at the point (100*factor, 100*factor) the two regions will have an area (100*100)/2, thus the convergence will be quicker.运行二分查找两次。首先通过对最后一行进行二分查找来确定 X,然后通过对最后一列进行二分查找来确定 Y。
Run your binary search twice. First determine X by running binary search on the last row and then determine Y by running binary search on last column.
简单的解决方案:先沿 X 方向移动,然后沿 Y 方向移动。
检查(0,50);如果为0,则检查(0,75);直到找到相邻的0和1。然后从那里向Y方向移动。
第二种解决方案:
检查成员 (50,50)。如果是 1,则检查 (25,25),直到找到 0。继续,直到找到相邻的 (X,X) 和 (X+1,X+1) 分别为 0 和 1。然后测试 (X,X +1) 和 (X+1,X)。它们都不为 1,或其中之一为 1。如果两者都不是,那么您就完成了。如果只有一个,例如 (X+1,X),那么您知道盒子的大小在 (X+1,X) 和 (100,X) 之间。使用二分查找来查找框的高度。
编辑:正如克里斯指出的,简单的方法似乎更快。
第二个解决方案(修改):
检查成员(50,50)。如果是 1,则检查 (25,25),直到找到 0。继续,直到找到相邻的 (X,X) 和 (X+1,X+1) 分别为 0 和 1。然后测试 (X,X +1)。如果是1,则在(X,X+1)...(X,100)行进行二分查找。否则在 (X,X)...(100,X) 行上进行二分搜索。
即便如此,我在这里也可能是死马。如果速度更快,那么速度可以忽略不计。这只是为了理论上的乐趣。 :)
编辑 2 正如 Fezvez 和 Chris 所说,二分搜索最有效地将搜索空间一分为二;我的方法是将区域划分为 1/4 和 3/4 块。 Fezvez 指出,可以通过预先计算除法因子来解决这个问题(但这将是额外的计算)。在我的算法的修改版本中,我选择前进的方向(X 或 Y 方向),这也有效地将搜索空间一分为二,然后进行二分搜索。总而言之,这表明这种方法总是会慢一些。 (而且实施起来更复杂。)
谢谢 Igor Oks,提出了有趣的问题。 :)
Simple solution: go first in X-direction and then in Y-direction.
Check (0,50); If it is 0, check (0,75); until You find adjacent 0 and 1. Then go to Y direction from there.
Second solution:
Check member (50,50). If it is 1, check (25,25), until You find 0. Continue, until You find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1) and (X+1,X). Neither or one of them will be 1. If neither, You are finished. If only one, say for example (X+1,X), then You know that the box's size is between (X+1,X) and (100,X). Use binary search to find box's height.
EDIT: As Chris pointed out, it seems that the simple approach is faster.
Second solution (modified):
Check member (50,50). If it is 1, check (25,25), until You find 0. Continue, until You find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1). If it is 1, then do binary search on line (X,X+1)...(X,100). Else do binary search on line (X,X)...(100,X).
Even then I am probably beating a dead horse here. If it will be faster, then by neglible amount. This is just for theoretical fun. :)
EDIT 2 As Fezvez and Chris put it, binary search divides the search space in two most efficiently; My approach divides the area to 1/4 and 3/4 pieces. Fezvez pointed out that this could be remedied by calculating the dividing factor beforehand (but that would be extra calculation). In modified version of my algorithm I choose the direction where to go (X or Y direction), which effectively also divides the search space in two, and then conduct binary search. To conclude, this shows that this approach will always be a bit slower. (and more complicated to implement.)
Thank You, Igor Oks, for interesting question. :)
对两个维度和一维情况使用二分搜索:
Use binary search on both dimensions and the 1D case: