如何处理整数矩阵以求 O(1) 内任意子矩形元素的平均值?

发布于 2024-10-20 04:58:53 字数 440 浏览 2 评论 0原文

这是一个面试问题:如何处理整数矩阵以求出元素的平均值O(1) 中是否有任何子矩形?正如评论所说,我们应该计算前缀和。

好吧,这很尴尬,但即使问了类似的问题我没明白。

This is an interview question: How to process a integer matrix to find the average for the elements of any sub-rectangle in O(1)? As the comments say, we should calculate the prefix sums.

Well, it is embarrassing but even having asked a similar question I did not get it.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

梦巷 2024-10-27 04:58:53

虽然你有几个合理的答案,但这似乎是一个一张图片胜过几千字的问题。

在此处输入图像描述

好的——我们想要的是矩形 4 的面积。但是,我们预先计算的面积只是告诉我们使用整个矩形的面积,从左上角的原点开始,一直延伸到 4 的右下角。要计算 4 的面积,我们必须减去上方和左侧的面积(1、2 和 3)。区域 2 & 3 的方法是一样的:我们只能从左上角开始一直延伸到右下角来获取它们的区域。这意味着我们可以获得它们的面积之和,但是矩形 1 的面积包含在两者中。

因此,为了获得矩形 4 的面积,我们需要找到其右下角的面积。我们找到 2 和 3 右下角的面积并将它们相加。然后,我们减去矩形 1 右下角的面积,因为它两次包含在我们的总计中。这样我们就得到了矩形 1、2 和 3 的总面积。我们用矩形 4 右下角的面积减去它,就得到了矩形 4 本身的面积。

举例来说,假设我设法使所有这些矩形具有相同的大小,因此我们将每个区域称为“1”。因此,每个矩形右下角给出的面积将是:

  1. 1
  2. 2
  3. 2
  4. 4

为了简单起见,我们将每个矩形定义为 pc_area(x) (即我们的预计算区域) d 到达矩形 x) 的右下角。

因此,我们采用:pc_area(4) - (pc_area(2) + pc_area(3) - pc_area(1))
替换数字,我们得到:4 - (2 + 2 -1),或4-3,这显然给出了 1,我们最初定义为正确的答案。

Though you have a couple of reasonable answers, this seems like a question for which a picture is worth several thousand words.

enter image description here

Okay -- what we want is the area of rectangle 4. Our pre-computed area, however, only tells use the area of the whole rectangle starting from the origin at the top left, and extending to the bottom-right corer of 4. To compute only the area of 4, we have to subtract the areas above and to the left (1, 2, & 3). The areas of 2 & 3 are the same way though: we can only get their areas starting from the top left and extending to their bottom right. That means we can get the sum of their areas, but the area of rectangle 1 is included in both of those.

So, to get the area of rectangle 4, we find the area at its bottom right corner. We find the area at the bottom-right corners of 2 and 3 and add them together. We then subtract the area at the bottom-right corner of rectangle 1 since it was included in our total twice. That gives us the total area of rectangles 1, 2 and 3. We subtract that from the area for the bottom-right corner of rectangle 4, and that gives us the area of rectangle 4 by itself.

Just for example, let's assume I managed to get all those rectangles the same size, so we'll call the area for each '1'. So, the area given at the bottom-right corner of each rectangle will be:

  1. 1
  2. 2
  3. 2
  4. 4

To keep things simple, let's define each of these as pc_area(x) (i.e. precomputed-area we'd get at the bottom-right corner of rectangle x).

So, we take: pc_area(4) - (pc_area(2) + pc_area(3) - pc_area(1))
Substituting the numbers, we get: 4 - (2 + 2 -1), or 4-3, which obviously gives 1, the answer we originally defined as correct.

谁许谁一生繁华 2024-10-27 04:58:53

请参阅@Sven Marnach 对@Michael 链接的帖子的回答。他概述的前缀和算法基于以下思想:如果您已经预先计算了以下元素序列的和:[0, 0], [0, 1], [0, 2], ..., [0, n-1],则可以在常数时间内计算序列 [a, b] 的总和为 [0, b] - [0, a-1]。同样的想法可以应用于二维数组。将左上角为(a, b)、右下角为(c, d)的子数组表示为[(a, b), (c, d)]。我们将像对待一维情况中的序列一样对待这样的子数组。预先计算左上角位于 (0, 0) 的所有子数组的总和: [(0, 0), (0, 0)], [(0, 0), (0, 1)], [(0, 0), (0, 2)], ..., [(0, 0), (0, m-1)], [[(0, 0), (1, 0)], [(0, 0) ), (1, 1)], ..., [(0, 0), (1, m-1)], ..., [(0, 0), (n-1, m-1)] 。有 n*m 个这样的子数组,并且可以根据较小子数组的总和在恒定时间内计算每个子数组的总和。如果现在要求我们生成子数组 [(a, b), (c, d)] 的和,我们可以发现它为 [(0, 0), (c, d)] - [(0, 0 ), (a-1, d)] - [(0, 0), (c, b-1)] + [(0, 0), (a-1, b-1)]。将其画在纸上,您将看到子数组如何重叠以及为什么我们需要添加最后一个子数组。

See @Sven Marnach'answer to the post @Michael linked to. The prefix sum algorithm he outlines is based on the following idea: if you have precomputed the sums of the following element sequences: [0, 0], [0, 1], [0, 2], ..., [0, n-1], you can compute the sum of the sequence [a, b] in constant time as [0, b] - [0, a-1]. The same idea can be applied to two-dimensional arrays. Denote the subarray with its upper left corner at (a, b) and its lower right corner at (c, d) as [(a, b), (c, d)]. We will treat such a subarray similarly to how we treated sequences in the 1-D case. Precompute the sums of all subarrays having their upper left corner at (0, 0): [(0, 0), (0, 0)], [(0, 0), (0, 1)], [(0, 0), (0, 2)], ..., [(0, 0), (0, m-1)], [[(0, 0), (1, 0)], [(0, 0), (1, 1)], ..., [(0, 0), (1, m-1)], ..., [(0, 0), (n-1, m-1)]. There are n*m such subarrays, and the sum of each can be computed in constant time based on the sums of smaller subarrays. If we are now asked to produce the sum of the subarray [(a, b), (c, d)], we can find it as [(0, 0), (c, d)] - [(0, 0), (a-1, d)] - [(0, 0), (c, b-1)] + [(0, 0), (a-1, b-1)]. Draw it on paper, and you'll see how the subarrays overlap and why we need to add the last subarray.

东风软 2024-10-27 04:58:53

我认为您被这样的假设误导了:您只是得到一个矩阵/数字列表。

在这种情况下,该要求是不可能的。您可以做的最快的事情是基于排序算法的启发式算法,您可以实现 O(Log(n))。

然而,只有在存在容错并且值的分布不失真的情况下,这才有用。

但是如果您能够控制数据结构,您就可以很好地解决您的问题。执行此操作的常用技术称为前缀和

在维基百科上阅读它,他们解释得比我更好。

但让我提供一个非常基本的、琐碎的类比,可以解决混乱。

您无法在 O(1) 时间内搜索列表中的元素。 Log(n) 是您能做的最好的事情,如果它已排序或您之前已排序。

但是,如果您可以控制数据结构,则可以构建一个哈希索引,您可以直接选择它并具有 O(1)。

前缀和是类似的东西。它是一种解决您的问题的数据结构,而不是一种算法(当然它不是一种算法,但更多的是在线的另一边)。

I think you are misleaded by the assumption that you just get a matrix / list of nubmers.

In this case the requirement is not possible. The fastest you could do is an heuristic algorithm that is based on a sorting algorithm and you could achieve O(Log(n)).

This however will only be useful if there is an error tolerance and the distribution of values is not distored.

But you can very well solve your problem if you have control over the data structure. The common teqnique to do that is called prefix sums.

read up on it on wikipedia, they explain it better than i ever could.

But let me provide a very basic trivial analogy that may solve confusion.

You cannot search an element in a list in O(1). Log(n) is the best you can do, if its sorted or you sort it before.

However if you have control over the data structure you can build a hash index where you can directly pick it and have O(1).

Prefix sums is something similar. Its a datastructure that solves your problem, not an algorithm (well not of course its an algorithm but more on the other side of the line).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文