查找任意子数组中所有项目之和的最佳算法是什么
我有一个问题,还有一个不错的解决方案。我希望有更好的解决方案。
问题
我有一个包含大约 200,000 个整数的数组。给定两个索引 i1 和 i2,我需要计算 i1 和 i2 之间所有元素的总和。数组中的每个整数都在 1 到 4 之间(含 1 和 4)。例如:
a = [1, 3, 2, 4, 3, 2, 4, 1];
subsection_sum(a, 0, 3); // returns 6: (1 + 3 + 2)
此操作将执行大约 200,000 次,因此需要非常快。 for 循环中的简单计数器是 O(n),而且速度太慢。数组在构建后永远不会被修改,因此有一个相对昂贵的预处理阶段是可以的。
迄今为止我最好的解决方案
该算法的工作时间为 O(log n):
首先用零填充原始数组,直到其长度为 2 的幂。接下来,将数组分成两个相等的部分并存储每个部分的总和。然后将数组分成四份并存储每份的总和。然后八分之一。继续这样做,直到数组被分成 2 个元素长的部分。对于上面的 8 元素数组,这需要两个步骤:
halves = [(a[0] + a[1] + a[2] + a[3]), (a[4] + a[5] + a[6] + a[7])]
quarters = [(a[0] + a[1]), (a[2] + a[3]), (a[4] + a[5]), (a[6] + a[7])]
然后给定两个索引,现在可以在 O(log n) 时间内计算出 subsection_sum。例如,subsection_sum(a, 2, 7) ==quarters[1] + half[1]。
I have a problem, and an OK-ish solution. I'm hoping there's a better solution out there.
The problem
I have an array with around 200,000 integers. Given two indices, i1 and i2, I need to calculate the sum of all elements between i1 and i2. Each integer in the array is between 1 and 4 inclusive. For example:
a = [1, 3, 2, 4, 3, 2, 4, 1];
subsection_sum(a, 0, 3); // returns 6: (1 + 3 + 2)
This operation will be performed around 200,000 times, so needs to be pretty fast. A simple counter in a for loop is O(n), and far too slow. The array is never modified after construction, so it's OK to have a relatively expensive pre-processing stage.
My best solution so far
This algorithm works in O(log n) time:
First pad the original array with zeros until its length is a power of two. Next, split the array into two equal parts and store the sum of each. Then split the array into quarters and store the sum of each. Then eighths. Continue doing this until the array is split into sections 2 elements long. For the 8-element array above, this takes two steps:
halves = [(a[0] + a[1] + a[2] + a[3]), (a[4] + a[5] + a[6] + a[7])]
quarters = [(a[0] + a[1]), (a[2] + a[3]), (a[4] + a[5]), (a[6] + a[7])]
Then given two indices, it is now possible to work out the subsection_sum in O(log n) time. For example, subsection_sum(a, 2, 7) == quarters[1] + halves[1].
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
引入一个包含累积和的辅助数组。也就是说,辅助数组的元素
i
具有原始数组的元素 0 到i
的总和。那么子数组的和就是辅助数组中两个元素的差。这将在常数时间内给出结果,O(1)
。这取决于问题中给出的
subsection_sum
函数的不变量:我假设
i1 <= i2
。重新排列,我们有:请注意,右侧的总和都是从
0
开始的。辅助数组可以被视为缓存所有i
的从零开始的总和值,subsection_sum(a, 0, i)
。Introduce an auxiliary array that contains the cumulative sum. That is, element
i
of the the auxiliary array has the sum of elements 0 throughi
of the original array. The subarray sum is then just the difference of two elements from the auxiliary array. This will give a result in constant time,O(1)
.This depends on an invariant in the
subsection_sum
function given in the question,:where I'm assuming
i1 <= i2
. Rearranging, we have:Note that the sums on the right-hand side both start from
0
. The auxiliary array can be viewed as caching the values for the sums from zero,subsection_sum(a, 0, i)
, for alli
.如果您可以负担
O(n)
额外存储空间,则可以创建一个查找表,其第i
个元素是索引0
处元素的总和> 到输入数组中的i
(包含)。在伪代码中:然后您可以使用此表来计算
i1
和i2
之间的array
中所有元素的总和,方法是采用常量的 差值时间。
If you can afford
O(n)
additional storage, you can create a lookup table whosei
th element is the sum of the elements at indices0
throughi
(inclusive) in the input array. In pseudocode:Then you can use this table to compute the sum of all elements in
array
betweeni1
andi2
by taking the differencewhich takes constant time.