我有 N 个矩形,其边平行于 x 轴和 y 轴。还有另一个矩形,模型。我需要创建一个算法来判断模型是否被 N 矩形完全覆盖。
我有一些想法。我认为首先,我需要按矩形的左侧对矩形进行排序(可以在 O(n log n) 时间内完成),然后使用垂直扫描线。
+------------------------------------------------------------> x
|O
| +----+
| +---------+ | |
| | ++----+--+ |
| | +-++----+-+| |
| | | | +-++-+
| +------+ +-------++
| +---------+
|
|
|
|y
蓝色矩形是模型。
首先,我需要抽象算法。对于实现没有特殊要求。矩形可以表示为一对点(左上角和右下角)。
这是准备考试的任务之一。我知道最好的算法可以在 O(n log n) 时间内完成此操作。
I have N rectangles with sides parallel to the x- and y-axes. There is another rectangle, model. I need to create an algorithm that can tell whether the model is completely covered by the N rectangles.
I have some ideas. I think that first, I need to sort the rectangles by their left side (it can be done in O(n log n) time), and then use a vertical sweeping line.
+------------------------------------------------------------> x
|O
| +----+
| +---------+ | |
| | ++----+--+ |
| | +-++----+-+| |
| | | | +-++-+
| +------+ +-------++
| +---------+
|
|
|
|y
The blue rectangle is the model.
First of all, I need the abstract algorithm. There are no special requirements with regard to the realization. A rectangle can be represented as a pair of points (left-top and bottom-right).
This is one of the tasks for preparing for a test. I know that the best algorithm can do this in O(n log n) time.
发布评论
评论(12)
这是一个分而治之的算法。 AVERAGE 案例复杂度非常好,我想说它类似于
O(n log MaxCoords)
。最坏的情况可能是二次的,但我认为这样的测试很难创建。为了让它变得更难,使递归函数调用的顺序随机。也许@Larry 建议的平均时间复杂度为O(n log n)
。我无法找出扫描线解决方案,但对于我尝试过的测试来说,这非常快。
基本上,使用适用于蓝色矩形的递归函数。首先检查蓝色矩形是否被其他矩形之一完全覆盖。如果是,我们就完成了。如果不是,则将其分成 4 个象限,并在这些象限上递归调用该函数。所有 4 次递归调用都必须返回 true。我包含一些绘制矩形的 C# 代码。您也可以将其更改为使用更大的值,但在这种情况下请删除绘图过程,因为这些过程将永远持续下去。我用生成的一百万个矩形和一个边长为十亿的正方形进行了测试,使其未被覆盖,并且提供的答案(
false
)在四核上花费了大约一秒钟。我主要在随机数据上对此进行了测试,但它似乎是正确的。如果事实证明不是,我会删除它,但也许它会让你走上正确的道路。
Here's a divide and conquer algorithm. AVERAGE case complexity is very good and I'd say it's something like
O(n log MaxCoords)
. Worst case could be quadratic though, however I think such a test would be pretty difficult to create. To make it even harder, make the order of the recursive function calls random. Maybe what @Larry suggested can get this toO(n log n)
on average.I can't figure out a sweep line solution, but for the tests I've tried this is very fast.
Basically, use a recursive function the works on the blue rectangle. First check if the blue rectangle is completely covered by one of the other rectangles. If yes, we're done. If not, split it into 4 quadrants, and recursively call the function on those quadrants. All 4 recursive calls must return true. I'm including some C# code that draws the rectangles. You can change it to work with larger values as well, but do remove the drawing procedures in that case, as those will take forever. I've tests it with a million rectangles and a square of side one billion generated such that it isn't covered, and the provided answer (
false
) took about a second on a quadcore.I've tested this on random data mostly, but it seems correct. If it turns out it isn't I'll just delete this, but maybe it'll get you on the right path.
您的扫描线走在正确的轨道上。从概念上讲,我们想要检测模型与扫描线的交点何时未被其他矩形覆盖。高级模板是将每个矩形分解为“左边缘”和“右边缘”事件,按 x 坐标对事件进行排序(如果矩形关闭,则将左侧放在右侧之前,如果矩形打开,则将右侧放在左侧之前),然后在 O(log n) 时间内处理每个事件。这基本上就是功课,我就不多说了。
You're on the right track with the sweep line. Conceptually, we want to detect when intersection of the model with the sweep line is not covered by the other rectangles. The high-level template is to break each rectangle into a "left edge" and a "right edge" event, sort the events by x coordinate (putting lefts before rights if the rectangles are closed and rights before lefts if they are open), and then process each event in O(log n) time. This is basically homework, so I will say no more.
这是一个通用算法,
如果是,则完成,
如果是,
如果 2. 或 3. 为否,则答案为否,否则为是。
现在的问题是如何有效地执行上述操作。上述操作可以在所有多边形上的单个循环中完成,所以我认为您正在考虑 O(n) 时间。
如果您需要创建有效的算法来测试多个模型,或者如果您必须优化尽可能最快的答案(以准备数据为代价),那么您正在寻找一种结构,可以快速回答矩形是否相交的问题(或包含)一个矩形。
如果您使用,例如 kd-trees,我相信这可以在 O( log n) 时间 - 但该算法中的重要变量也是找到的矩形 k 的数量。您最终会得到类似 O(k + log n) 的结果,并且您还需要执行并集部分来测试模型是否被覆盖。
我的猜测是,并集可以在 O(k log k) 中计算,因此如果 k 很小,那么您正在查看 O(log n),如果 k 与 n 相当,那么它应该是 O(k log k)。
另请参阅此问题。
编辑:
应对交集和并集的复杂性。
更详细地说,我们有两种情况,具体取决于 k << n或k与n a)相当
在k≤a)的情况下, 。 n 并假设交集/并集的多项式复杂度(所以这里我放弃猜测 O(k log k) )我们有:
集总计为 O(k + log n + f(k)),它直接等于 O(log n),因为大 O 仅取决于增长最快的项。
在这种情况下,算法更重要的部分是寻找多边形。
b) 在 k 与 n 相当的情况下,我们必须看看交集和并集算法的复杂性
(请注意这里的并行关系,即 RDBM 如何根据选择性使用索引或执行表扫描;这与我们这里的选择类似)。
如果 k 足够大,该算法就不再是查找与矩形相交的矩形的算法,而更像是计算多边形并集的算法。
但是,我相信 kd 树也可以帮助找到线段的交集(这是构建并集所必需的),所以即使这部分算法也可能不需要 k^2 时间。
此时,我将研究计算并集面积的有效算法。
Here's a generic algorithm
if yes you are done
if yes
if 2. or 3. is no then the answer is no, otherwise it is yes
Now the question is how to do the above efficiently. The above can be done in a single loop over all polygons, so I think you are looking at O(n) time.
If you need to create efficient algorithm that will test multiple models, or if you must optimize for fastest answer possible (at the expense of preparing the data) then you are looking for a structure that will allow quick answer to question if a rectangle intersects (or contains) a rectangle.
If you use, for example kd-trees, I believe that this can be answered in O(log n) time - but the important variable in this algorithm is also the number of found rectangles k. You will end up with something like O(k + log n) and you will also need to do the union part to test if the model is covered.
My guess is that the union could be computed in O(k log k), so if k is small then you are looking at O(log n) and if k is comparable to n then it should be O(k log k).
See also this question.
EDIT:
In response to complexity of intersections and unions.
In more details, we have two scenarios depending on if k << n or k comparable to n
a) in case of k << n and assuming polynomial complexity for intersection/union (so here I give up the guess O(k log k) ) we have:
The total is O(k + log n + f(k)), which is directly equal to O(log n) since big O depends only on the fastest growing term.
In this case the more significant part of the algorithm is finding the polygons.
b) in the case of k comparable to n we must take a look at the complexity of intersection and union algorithms
(notice the parallel here on how the RDBMs, depending on selectivity, might use index or do table scan; it is a similar choice to what we have here).
If k is big enough the algorithm becomes less of an algorithm for finding rectangles that intersect with the rectangle and becomes more of an algorithm for calculating the union of polygons.
But, i believe that the kd tree can also help in finding the intersection of segments (which are necessary to build the union), so even this part of algorithm might not need k^2 time.
At this point I would investigate efficient algorithms for calculating the area of unions.
有一种简单的
O(N^2)
方法,类似于所提出的“光栅”方法。由于所有矩形都是轴平行的,因此最多只能有2N
个不同的 x 和 y 维度。对所有 x 和 y 进行排序并重新映射:x_i ->我。现在您有了一个
2N x 2N
矩阵,您可以轻松地使用朴素的O(N^2)
算法。There is a trivial
O(N^2)
approach that is similar to the "raster" approach that is brought up. Since all the rectangles are axis-parallel, there can only be at most2N
distinct x and y dimension. Sort all the x's and y's and remap:x_i -> i
. So now you have a2N x 2N
matrix where you can easily use the naiveO(N^2)
algorithm.好吧,现在看来我什至晚上都睡不着了,因为我想到了这个问题......但似乎我终于得到了一个 O(n log n) 解决方案,平均值 情况(但与
@lVlad
相比,出现病态输入的可能性降低)。我们都知道“分而治之”技术。为了确保
O(n log n)
使用它,我们通常关注两点:O( n)
考虑到这些约束,我详细阐述了以下算法,这让人想起
qsort
,因此遭受同样的陷阱(即分形输入)。算法
red
与blue
相交的部分,它们被插入到一个HashSet中删除重复项 -->O(n)
O(n)
红色
放入分区中,应用裁剪技术上,请注意,给定的red
可能最终会在不同的分区中给出多个块枢轴选择是算法的基石,如果分区不合适,我们就无法达到所需的复杂性。此外,它必须在
O(n)
内完成。到目前为止,我有 2 个建议:最大面积
:使用面积最大的矩形,因为这意味着分区之后的面积将是最小的,但它的缺点是很容易胜过中位数of 3:基于qsort,选取3个元素,选择中位数(中心更接近3个中心的重心的那个)
我建议将它们混合在一起:
实现的另一个方面是递归的尾部。与
qsort
一样,该算法对于较小的n
可能效率较低。我建议使用introsort
技巧,而不是一直到 1:如果n
小于 12,则使用以下算法:红色
投影到轴上(仅边缘)并对它们进行排序(ala introsort)维度不可知的
算法被正式定义为适用于任何给定的维度,具有相同的渐近复杂度,即平均 O(n log n)。这使我们有机会在维度 1 中进行测试以识别病理输入。
病理输入
就像它所建模的
qsort
一样,它对阶乘输入很敏感。我所说的阶乘输入是指:每当您选择间隔的平均值时,所有元素都位于其一侧。
有了这样的输入,即使选择中位数 3 也不太可能产生很好的切割结果。
编辑:
我将用一个小方案来展示分区的想法,正如
@lVlad
指出的那样,它有点模糊。好的,我们检查覆盖范围的矩形现在被分成 9 个子矩形。我们忽略P,它是我们的枢轴。
现在,我们真的希望每个子矩形被较少的
红色
覆盖而不是N
。枢轴被选择为中心的重心,因此这意味着如果我们的“随机”选择成立,则枢轴的每个方向上都有大约同样多的红色中心。那里有点模糊,因为一些特殊的配置可能会使其在一步中几乎没有增益(所有矩形都有相同的中心,例如我们只选择了较小的一个),但它会造成混乱,因此下一步将有所不同。
如果有人能正式说明这一点,我很高兴,我是一名工程师,而不是计算机科学家,而且我的数学落后......
Okay, now it seems I can't even sleep at night because I think about this problem... but it also seems I finally got an O(n log n) solution, in average case (but with reduced chances of having a pathological input compared to
@lVlad
).We all know the Divide and Conquer technic. To ensure
O(n log n)
using it, we usually focus on 2 points:O(n)
With these constraints in mind I have elaborated the following algorithm, which is reminiscent of
qsort
, and thus suffer the same pitfalls (namely, fractal inputs).Algorithm
red
that intersect withblue
, they are inserted in a HashSet to remove duplicates -->O(n)
O(n)
red
in the partitions, applying the Clipping technic, note that a givenred
might end up giving several chunks in different partitionsThe Pivot Choice is the corner stone of the algorithm, if the partition is ill-tailored we cannot achieve the required complexity. Also, it must be accomplished in
O(n)
. I have 2 proposals so far:Maximum Area
: use the rectangle with the greatest area, because it means that the partitions will have the smallest area afterward, however it suffers from being easy to trumpMedian of 3
: based on qsort, pick up 3 elements, selection the median (the one with the center closer to the barycenter of the 3 centers)I propose to mix them up thusly:
Another aspect of implementation is the Tail of the recursion. Like
qsort
it's probable that the algorithm is inefficient for smalln
. Instead of going all the way to 1, I propose to use theintrosort
trick: ifn
is smaller than say 12, then use the following algorithm instead:red
on the axis (only the edges) and sort them (ala introsort)Dimension agnostic
The algorithm is formally defined to be applicable in any given dimension with the same asymptotic complexity, in average O(n log n). This gives us the opportunity to test in dimension 1 to identify the pathological inputs.
Pathological input
Like
qsort
on which it is modelled it is sensible to factorial inputs. By factorial inputs I mean:whenever you pick the average of your interval, you have all the elements on one side of it.
With such an input even choosing the median of 3 is unlikely to yield a very good cut.
EDIT:
I am going to show the partition idea with a little scheme, as
@lVlad
noted it was kind of fuzzy.Okay, so the rectangle we check for coverage is now splitted into 9 subrectangles. We ignore P, it's our pivot.
Now, we would really like that each subrectangle is covered by less
red
thanN
. The pivot is chosen as a barycenter of the centers, thus it means if our "random" choice held true that there are about as manyred
s centers in each direction of the pivot.It's kind of fuzzy there because some special configurations might make it so that there is little gain at one step (all rectangles have the same center and we just picked the smaller one for example), but it will create chaos and thus the following step will be different.
I am happy if anyone can formalize that, I am an engineer, not a computer scientist, and my maths lag behind...
我一直在思考这个问题,我想我终于明白了 @algorithmist 的含义>扫线。然而,
排序
操作的存在意味着我有:O(n log n)
O(n**2 )
在最坏情况扫描线
首先,我们需要一些排序,因为我们的
扫描线
将包括走一个有序集。此有序集将包含每个
red
的top
和bottom
线,只要它们位于top< /code> 和
blue
的bottom
。这将我们的空间划分为(最多)n*2
水平条带。现在,我们需要确保在每个
strip
中,整个blue
被覆盖,并且这个操作不能超过O(log n)
代码>复杂性,如果我们(对于每个条带)有一个覆盖间隔的列表,就可以做到这一点。 这是 @algorithmist 所说的“事件”为了处理这个事件,我们将使用下面描述的二叉树,它可以精确地处理添加或删除矩形
O(log n)
运算并产生树覆盖的最右边的区间,我们用它来判断蓝色条带是否被覆盖。二叉树
首先,我将为
红色
矩形建立索引。我们使用这个函数对它们进行排序:然后我将创建一个专用的二叉树。
处理错误“代码块无法跟随列表”:
现在我们有两种可能性,假设子节点覆盖
[a,b]
和 < code>[c,d]:c <= b
,则节点保存[a,d]
,[c ,d]
为什么它有效?让我们以 4 个叶子为例:
特殊节点忽略
[3,5]
因为它不是最右边的区间。原因是矩形是有序的:[6,9]
右侧的矩形不会覆盖缺失的[5,6]
间隔,因为它们在之后开始>6
[3,5]
左侧的矩形在3
之前开始,因此如果它们覆盖了缺失的[5,6]< /code> 无论如何,它们都会覆盖
[3,5]
根可能不会指示所覆盖的确切间隔集:仅覆盖最右边的间隔。然而,我们完全足以判断
blue
是否被完全覆盖!此树上有 2 个可用操作:
每个操作都是相似的:
递归位需要
O(log n)
。这是平衡二叉树的经典属性。一旦完成,我们就得到了根覆盖的最右边的区间,这足以判断蓝色段是否被完全覆盖。复杂性
算法的复杂性很简单:
O(n)
个事件O(log n)
内处理,结果是 < code>O(n log n) 为核心部分。
但是,我们不应该忘记,我们还有 2 个
sort
操作:每个操作都应采用
平均为 O(n log n)
,但在最坏的情况下可能退化为O(n**2)
,具体取决于所使用的排序算法。I've been thinking about it and I think I finally understood what
@algorithmist
meant by sweep line. However the very presence ofsorting
operations means that I have:O(n log n)
in averageO(n**2)
in the worst caseSweep Line
First of all, we need some sorting, because our
sweep line
will consist of walking an ordered set.This ordered set will feature the
top
andbottom
line of each of thered
s, as long as they are between thetop
andbottom
ofblue
. This divides our space into (at most)n*2
horizontal strips.Now, we need to make sure that in each
strip
, the whole ofblue
is covered, and this operation cannot have more thanO(log n)
complexity, this could be done if we had (for each strip) a list of the covered intervals. This is the 'event'@algorithmist
is speaking ofTo handle this event, we'll use a binary tree described below which handles adding or removing a rectangle in exactly
O(log n)
operations and yields the rightmost interval covered by the tree, which we use to tell if the strip ofblue
is covered or not.Binary Tree
First of all, I am going to index the
red
rectangles. We sort them using this function:I am going then to create a dedicated binary tree.
N
leaves, each representing ared
rectangle and indicating its presence or absence. They are ordered according to the index.Handling the bug "code block cannot follow list":
Now we have two possibilities, let's say the children cover
[a,b]
and[c,d]
:c <= b
, then the node hold[a,d]
[c,d]
Why does it works ? Let's take an example using 4 leaves:
The special node ignore
[3,5]
because it's not the rightmost interval. The reasoning is that the rectangles are ordered:[6,9]
will cover the missing[5,6]
interval since they begin after6
[3,5]
begin before3
, so if they cover the missing[5,6]
they'll cover[3,5]
anywayThe root may not indicate the exact set of intervals covered: only the rightmost interval covered. However, it's perfectly sufficient for us to tell if
blue
is completely covered or not!There are 2 operations available on this tree:
Each is similar:
The recursive bit takes
O(log n)
. It's a classic property of the balanced binary trees. And once it's done we have the rightmost interval covered by the root which is sufficient to tell whether or not theblue
segment is entirely covered or not.Complexity
The complexity of the algorithm is simple:
O(n)
eventsO(log n)
Which yields
O(n log n)
for the core part.However, we should not forget that we also have 2
sort
operations:Each shall take
O(n log n)
in average, but may degenerate intoO(n**2)
in the worst case, depending on the sorting algorithm used.很难知道你在寻找什么,但对我来说,它听起来像一个 R-tree可能有用吗?
Hard to know what you are looking for but it sounds to me like an R-tree might work?
好的,我已经问了足够多的问题,这里有一些答案......
如果数据表示为栅格,则一种算法很简单:
如果数据是矢量,那就有点复杂了。首先定义一个函数,该函数返回表示两个矩形的交集(如果有)的矩形。这很简单。然后继续:
同样,只关心与蓝色矩形相交的红色矩形。对于每个红色矩形,计算该矩形与 UnCoveredRectangle 的交集。相交会导致以下情况之一:
2.1 交集等于 UnCoveredRectangle。工作已经完成。
2.2 交集从 CoveredRectangle 中“咬”出一个矩形块。剩余的 UnCoveredRectangle 将是矩形、L 形块或 U 形块。如果它本身是一个矩形,则将 UnCoveredRectangle 设置为剩余的矩形,然后转到步骤 2。如果 UnCoveredRectangle 是 L 形或 U 形,则将其分割为 2 个或 3 个矩形,并从步骤 2 递归。
如果在 UnCoveredRectangle(部分)的面积发送到 0 之前用完红色矩形,则说明已完成。
好吧,我不知道这个算法的复杂性,但除非矩形的数量很大,否则我不太担心,尽管@den 可能会担心。而且我省略了很多细节。我无法像@den 那样画出漂亮的图表,所以你必须自己想象。
OK, I've asked enough questions, here's something of an answer ...
If the data is represented as a raster one algorithm is trivial:
If the data is vector it's a little more complicated. First define a function which returns the rectangle representing the intersection (if any) of two rectangles. This is simple. Then proceed:
Again, only bother with the Red rectangles which intersect the Blue one. For each Red rectangle, compute the intersection of the rectangle with the UnCoveredRectangle. The intersection will result in one of the following situations:
2.1 The intersection equals the UnCoveredRectangle. The job is finished.
2.2 The intersection 'bites' a rectangular chunk out of the CoveredRectangle. The remaining UnCoveredRectangle will be either a rectangle, an L-shaped piece, or a U-shaped piece. If it is a rectangle itself, set UnCoveredRectangle to be the remaining rectangle, and go to step 2. If the UnCoveredRectangle is L- or U-shaped, split it into 2, or 3, rectangles and recurse from step 2..
If you run out of Red rectangles before the area of (part of) UnCoveredRectangle is sent to 0, you've finished.
OK I haven't got a clue about the complexity of this algorithm, but unless the number of rectangles is huge, I'm not too concerned, though perhaps @den is. And I've omitted a lot of details. And I can't draw nice diagrams like @den did, so you'll have to picture it for yourselves.
这是一种不使用光栅化(即仅使用纯数字)来完成此操作的方法。
注意:这不是 O(n log n),更像是 O(n^2)。然而,它是一个解决方案。无论它是一个答案,如果 O(n log n) 是一个要求,则可能不是。
因此输出应为:
让我说明一下到目前为止的过程
关联的矩形:
现在创建一个空列表
L=[]
,并开始处理坐标和关联的矩形:X=1
列表为空,无需处理
没有什么可删除的
添加 A: L=[ A ]
X=2
列表包含矩形,将列表处理为左边缘 X=1 和右边缘 X=2 的矩形(到目前为止我们已处理的两个坐标),并且使用它们原来的顶部和底部坐标。
没有什么可删除的。
添加 B: L=[ A, B ]
X=3
列表包含矩形,以相同的方式处理列表(A 和 B),将它们视为暂时具有左右坐标 X=2 和 X=3,并使用它们原始顶部和底部坐标。
没有什么可删除的
添加C:L=[ A, B, C ]
X=4
将三个矩形同上处理,临时左右坐标分别为X=3和X=4
删除 B: L=[A, C ]
无需添加
X=5 和 X=6
以完全相同的方式处理它们。
这意味着您最终会得到矩形的“条带”,如下所示(我将它们拉开一点以更清楚地说明它们是条带,但它们像原始图中一样连续并排放置):
好的,所以现在您有了输出,即坐标对的集合,每对都有一个关联的矩形列表。
现在我们表演一个技巧。我们以完全相同的方式处理垂直条带,只是这次我们使用 Y 坐标作为分隔符。让我们处理上面 3 和 4 之间的条带。请记住,条带的左右坐标分别为 3 和 4。
条带看起来像这样:
矩形坐标列表:
新空列表 L=[]
处理坐标:
Y=1
没有要处理的内容 (L= [])
将 A 添加到列表中,L=[ A ]
Y=2
进程 A 临时具有 Y=1 和 2 的顶部和底部坐标(请记住,它还具有 X=3 和 4 的临时左侧和右侧坐标
添加 C, L=[ A, C ]
Y=3
处理 A 和 C,现在两者的临时坐标均为 (3, 2)-(4, 3)
添加 B, L=[ A, B, C ]
Y=4
处理 A、B 和 C,坐标均为 (3, 3)-(4, 4)
去掉C, L=[ A, B ]
Y=5
处理A和B,坐标(3, 4)-(4, 5)
删除 A,L=[ B ]
Y=6
处理 B,坐标 (3, 5)-(4, 6)
最终输出:
Y 坐标对,以及与其关联的矩形(也暂时获得新的 X 坐标) ):
现在,假设您想问这样的问题:B 是否被其他矩形的所有任意组合完全覆盖。
答案可以计算如下:
在上面的例子中,我们看到最终列表中的第三个和第四个矩形包含B,但也包含其他矩形,因此B的那些部分被覆盖,但列表中的最后一个矩形也包含B,但没有其他矩形,因此这部分内容没有被涵盖。
根据原始图表,最终结果将包括如下矩形(每个矩形中的字母表示哪个原始矩形与这个新矩形相关联):
如果我们现在重新审视原始图表,我已经遮蔽了以下部分:上面的算法会发现包含 B,但没有其他矩形:
中间的垂直条表明该部分将作为两个矩形返回,由于垂直条的计算方式,在该位置分割。
我真诚地希望我在这里能让自己被理解。我有一些代码可以帮助您实现每次运行坐标列表,但现在是午夜过后 01:21,我要睡觉了,但如果您希望查看一些实际代码,请留下评论。
或者自己实现它也是一个很好的练习:)
这是包含相关方法的类的链接: RangeExtensions.cs。
该方法是
Slice
方法,只需在页面中搜索即可。所讨论的类型 Range 基本上是从一个值到另一个值的范围,因此除了上述算法之外,还需要一些数据构建和维护。Here's a way to do this without using rasterization, that is, using only pure numbers.
Note: This is not O(n log n), more like O(n^2). It is, however, a solution. Whether it is an answer, probably not if O(n log n) is a requirement.
The output should thus be:
Let me illustrate the process so far
Associated rectangles:
You now create an empty list,
L=[]
, and start processing the coordinates and associated rectangles:X=1
List is empty, nothing to process
Nothing to remove
Add A: L=[ A ]
X=2
List contains rectangles, process list as rectangles that have a left edge of X=1, and a right edge of X=2 (the two coordinates we've processed so far), and use their original top and bottom coordinates.
Nothing to remove.
Add B: L=[ A, B ]
X=3
List contains rectangles, process list (both A and B) the same way, treat them as temporarily having left and right coordinates as X=2 and X=3, and use their original top and bottom coordinates.
Nothing to remove
Add C: L=[ A, B, C ]
X=4
Process the three rectangles the same way as above, temporary left and right coordinates are X=3 and X=4
Remove B: L=[A, C ]
Nothing to add
X=5 and X=6
Process these in the exact same manner.
This means you will end up with "strips" of rectangles, like this (I've pulled them a bit apart to clearer illustrate that they are strips, but they are located side-by-side continously like in the original diagram):
Ok, so now you have your output, a collection of coordinate-pairs, each pair having an associated list of rectangles.
Now we do a trick. We process the vertical strip in the exact same manner, only this time we use the Y coordinates as the delimiters. Let's handle the strip between 3 and 4 above. Remember that the strip has a left and right coordinates of 3 and 4.
Strip looks like this:
List of coordinates with rectangles:
New empty list L=[]
Process the coordinates:
Y=1
Nothing to process (L=[])
Add A to list, L=[ A ]
Y=2
Process A with temporarily having a top and bottom coordinates of Y=1 and 2 (and remember that it also has a temporary left and right coordinates of X=3 and 4
Add C, L=[ A, C ]
Y=3
Process A and C, both now having temporary coordinates of (3, 2)-(4, 3)
Add B, L=[ A, B, C ]
Y=4
Process A, B and C, all having coordinates of (3, 3)-(4, 4)
Remove C, L=[ A, B ]
Y=5
Process A and B, coordinates (3, 4)-(4, 5)
Remove A, L=[ B ]
Y=6
Process B, coordinates (3, 5)-(4, 6)
Final output:
pairs of Y-coordinates, with rectangles associated with them (that also have temporarily got new X-coordinates):
Now, suppose you want to ask the question: Is B fully covered by all any combination of the other rectangles.
The answer can be worked out as follows:
In the above example, we see that the 3rd and 4rd rectangle in the final list contains B, but also contains other rectangles, hence those parts of B is covered, but the final rectangle in the list also contains B, but no other rectangle, hence this part is not covered.
According to the original diagram, the final result would include rectangles as follows (the letters inside each denote which original rectangle is associated with this new rectangle):
If we now take a new look at the original diagram, I have shaded out the parts that the above algorithm would find contains B, but no other rectangle:
The vertical bar in the middle there is to illustrate that the part would be returned as two rectangles, split at that location, due to the way the vertical strips were worked out.
I seriously hope I made myself understood here. I have some code that can help you with the implementation of each run through the lists of coordinates, but it's 01:21 past midnight here and I'm going to bed, but leave a comment if you wish to see some actual code for this.
Or it would be a great exercise to implement it yourself :)
Here's the link to the class containing the method in question: RangeExtensions.cs.
The method is the
Slice
method, just search the page for it. The type in question, Range, is basically a range from one value to another, so there's a bit of data construction and maintenance in addition to the above algorithm.以下是如何使扫描线在 O(n lg n) 中工作。我将重点讨论 BST 如何工作的棘手部分。
保持平衡的 BST,其中包含与当前扫描线相交的每个矩形的起点和终点。 BST的每个节点包含两个辅助字段:minOverlap和deltaOverlap。字段 minOverlap 通常存储与该节点的子树覆盖的区间中的任何点重叠的矩形的最小数量。但是,对于某些节点,该值略有偏差。我们维护一个不变量,即 minOverlap 加上直到根的每个节点的 deltaOverlap 之和,具有与该节点的子树中的区域重叠的矩形的真正最小数量。
当我们插入一个矩形起始节点时,我们总是在叶子处插入(并且可能会重新平衡)。当我们沿着插入路径遍历时,我们将任何非零 deltaOverlap 值“下推”到插入节点的访问路径的子节点,从而更新访问路径上节点的 minOverlap。然后,我们需要在 O(lg n) 时间内将每个节点增加到树中插入节点的“右侧”。这是通过增加插入节点的所有右祖先的 minOverlap 字段并增加插入节点的右祖先的所有右子节点的 deltaOverlap 来完成的。对于插入矩形末端的节点以及删除点,执行类似的过程。可以通过仅修改参与轮换的节点的字段来执行重新平衡操作。您所要做的就是检查扫描中每个点的根,看看 minOverlap 是否为 0。
我省略了处理重复坐标之类的细节(一个简单的解决方案就是在任何节点之前对开放矩形节点进行排序)相同坐标的闭合矩形节点),但希望它能给您带来想法,并且相当有说服力。
Here's how to make a sweepline work in O(n lg n). I will focus on the tricky part of how the BST works.
Keep a balanced BST that contains the start and end of each rectangle that intersects the current sweepline. Each node of the BST contains two auxiliary fields: minOverlap and deltaOverlap. The field minOverlap generally stores the minimum number of rectangles overlapping any point in the interval covered by the subtree of that node. However, for some nodes the value is slightly off. We maintain an invariant that minOverlap plus the sum of deltaOverlap for every node up to the root has the true minimum number of rectangles overlapping a region in the subtree of the node.
When we insert a rectangle-starting node, we always insert at a leaf (and possibly rebalance). As we traverse down the insertion path we "push down" any non-zero deltaOverlap values to the children of the access path of the inserted node, updating the minOverlap of the nodes on the access path. Then, we need to increment every node to the 'right' of the inserted node in the tree in O(lg n) time. This is accomplished by incrementing the minOverlap field of all the right ancestors of the inserted node and incrementing the deltaOverlap of all the right children of the right ancestors of the inserted node. An analogous process is carried out for the insertion of the node that ends the rectangle, as well as the deletion of points. A rebalancing operation can be performed by modifying only the fields of the nodes involved in the rotation. All you have to do is check the root at each point in the sweep to see if the minOverlap is 0.
I've left out details of handling things like duplicate coordinates (one simple solution is just to order the open-rectangle nodes before any close-rectangle nodes of the same coordinate), but hopefully it gives you the idea, and is reasonably convincing.
你知道 矩形并集的面积?
这里您需要做的就是计算两个面积:
如果这些面积相等,则模型是完全被覆盖的,否则就不是。
Do you know the usual worst-case
O(nlogn)
algorithm for the area of the union of rectangles?All you need to do here is to compute the two areas:
If these areas are equal, the model is totally covered, otherwise it isn't.
这是使用一些内存的 O(n lg n) 运行时方法。
使用示例:
我们只对场景中包含“模型”矩形的子部分感兴趣;在此示例中,“模型”矩形为
1,1 -> 6,6
1) 将模型矩形边界内的所有 x 坐标(左侧和右侧)收集到列表中,然后对其进行排序并删除重复项
2) 收集所有将模型矩形边界内的 y 坐标(顶部和底部)放入列表中,然后对其进行排序并删除重复项
3) 根据唯一 x 坐标之间的间隙数创建一个 2D 数组* 唯一 y 坐标之间的间隙数。这可以为每个单元使用一位,并且您可以考虑使用 C++ STL 的 bit_vector() 来进行有效的表示。
4)将所有矩形绘制到该网格中,绘制它出现的单元格:
5)如果有任何单元格保持未绘制,您就知道您的模型没有完全被遮挡(或者您正在测试的任何内容)。
采集坐标和绘制步骤为 O(n),坐标排序为 O(n lg n)。
这改编自我的答案之一:什么是查找重叠矩形面积的有效算法
Here's an O(n lg n) runtime approach using some memory.
Using the example:
We're only interested in the subpart of the scene that contains the 'model' rectangle; in this example, the 'model' rectangle is
1,1 -> 6,6
1) collect all the x coordinates that are within the bounds of the model rectangle (both left and right) into a list, then sort it and remove duplicates
2) collect all the y coordinates that are within the bounds of the model rectangle (both top and bottom) into a list, then sort it and remove duplicates
3) create a 2D array by number of gaps between the unique x coordinates * number of gaps between the unique y coordinates. This can use a single bit per cell, and you can consider using say C++ STL's bit_vector() for an efficient representation.
4) paint all the rectangles into this grid, painting cell it occurs over:
5) Should any cells remain unpainted, you know your model is not completely occluded (or whatever you are testing).
The gathering coordinates and the painting steps are O(n), and the sorting of the coordinates is O(n lg n).
This is adapted from one of my answers to: What is an Efficient algorithm to find Area of Overlapping Rectangles