Computing a moving maximum
Possible Duplicate:
Find the min number in all contiguous subarrays of size l of a array of size n
I have a (large) array of numeric data (size N) and would like to compute an array of running maximums with a fixed window size w.
More directly, I can define a new array out[k-w+1] = max{data[k-w+1,...,k]} for k >= w-1 (this assumes 0-based arrays, as in C++).
Is there a better way to do this than N log(w)?
[I'm hoping there should be a linear one in N without dependence on w, like for moving average, but cannot find it. For N log(w) I think there is a way to manage with a sorted data structure which will do insert(), delete() and extract_max() altogether in log(w) or less on a structure of size w -- like a sorted binary tree, for example].
Thank you very much.
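
For reference, the N log(w) approach mentioned in the question can be sketched with a `std::multiset` as the sorted structure. This is only a minimal sketch: the function name `running_max_multiset` and the choice of `double` as the element type are assumptions made for illustration, not something from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: running maximum in O(N log w) using a sorted multiset.
// out[k - w + 1] = max(data[k - w + 1 .. k]) for every k >= w - 1.
std::vector<double> running_max_multiset(const std::vector<double>& data,
                                         std::size_t w) {
    assert(w > 0);
    std::vector<double> out;
    if (data.size() < w) return out;  // no complete window exists
    out.reserve(data.size() - w + 1);

    std::multiset<double> window;  // holds the elements of the current window
    for (std::size_t k = 0; k < data.size(); ++k) {
        window.insert(data[k]);  // O(log w)
        if (k >= w) {
            // Erase exactly one copy of the element that just left the window.
            window.erase(window.find(data[k - w]));
        }
        if (k + 1 >= w) {
            out.push_back(*window.rbegin());  // largest element in the window
        }
    }
    return out;
}
```

Calling `running_max_multiset({1, 3, 2, 5, 4}, 3)` yields one maximum per complete window, so the result has N - w + 1 entries.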
There is indeed an algorithm that can do this in O(N) time with no dependence on the window size w. The idea is to use a clever data structure that supports the following operations: enqueue, dequeue, and find-max.
This is essentially a queue data structure that supports access to (but not removal of) the maximum element. Amazingly, as seen in this earlier question, it is possible to implement this data structure such that each of these operations runs in amortized O(1) time. As a result, if you use this structure to enqueue w elements, then repeatedly dequeue one element and enqueue the next while calling find-max as needed, it will take only O(N + Q) time, where Q is the number of queries you make. If you only need the maximum of each window once, this ends up being O(N), with no dependence on the window size.
Hope this helps!
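
One common way to get a queue with amortized O(1) find-max is a monotonic deque of indices. The sketch below uses that construction as an assumption; the earlier question referred to above may build the structure differently (for example from two stacks). The name `running_max_deque` and the `double` element type are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: running maximum in O(N) using a monotonic deque.
// The deque stores indices whose values are strictly decreasing from front
// to back, so the front always refers to the maximum of the current window.
std::vector<double> running_max_deque(const std::vector<double>& data,
                                      std::size_t w) {
    assert(w > 0);
    std::vector<double> out;
    if (data.size() < w) return out;  // no complete window exists
    out.reserve(data.size() - w + 1);

    std::deque<std::size_t> candidates;
    for (std::size_t k = 0; k < data.size(); ++k) {
        // Enqueue: older elements that are not larger can never be a maximum again.
        while (!candidates.empty() && data[candidates.back()] <= data[k]) {
            candidates.pop_back();
        }
        candidates.push_back(k);

        // Dequeue: drop the front index once it falls out of the window [k-w+1, k].
        if (candidates.front() + w <= k) {
            candidates.pop_front();
        }

        // Find-max: report once the first full window has been seen.
        if (k + 1 >= w) {
            out.push_back(data[candidates.front()]);
        }
    }
    return out;
}
```

Each index is pushed and popped at most once, which is where the amortized O(1) cost per operation comes from.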
I'll demonstrate how to do it with a list of length N = 23 and W = 4.
Make two new copies of your list, L1 and L2.
Loop from i = 0 to N-1. If i is not divisible by W, then replace L1[i] with max(L1[i], L1[i-1]).
Loop from i = N-2 to 0. If i+1 is not divisible by W, then replace L2[i] with max(L2[i], L2[i+1]).
Make a list L3 of length N + 1 - W, so that L3[i] = max(L2[i], L1[i + W - 1]).
Then this list L3 is the moving maxima you seek: L2[i] is the maximum of the range between i and the next vertical line, while L1[i + W - 1] is the maximum of the range between the vertical line and i + W - 1. Here a vertical line is a block boundary, placed just before each index divisible by W.
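
The steps of this answer can be sketched in C++ as follows. The list names `l1`, `l2` and the output follow the answer's L1, L2 and L3; the function name `running_max_blocks` and the `double` element type are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: running maximum in O(N) via per-block prefix and
// suffix maxima, where blocks start at indices divisible by w.
std::vector<double> running_max_blocks(const std::vector<double>& data,
                                       std::size_t w) {
    assert(w > 0);
    const std::size_t n = data.size();
    std::vector<double> out;
    if (n < w) return out;  // no complete window exists

    std::vector<double> l1(data);  // prefix maxima within each block
    std::vector<double> l2(data);  // suffix maxima within each block

    // Forward pass: carry the maximum forward unless i starts a new block.
    for (std::size_t i = 1; i < n; ++i) {
        if (i % w != 0) l1[i] = std::max(l1[i], l1[i - 1]);
    }
    // Backward pass over i = n-2 .. 0: carry the maximum backward unless
    // i is the last index of its block.
    for (std::size_t i = n - 1; i-- > 0;) {
        if ((i + 1) % w != 0) l2[i] = std::max(l2[i], l2[i + 1]);
    }

    // A window [i, i+w-1] spans at most two blocks: the tail of one block
    // (covered by l2[i]) and the head of the next (covered by l1[i+w-1]).
    out.resize(n - w + 1);
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = std::max(l2[i], l1[i + w - 1]);
    }
    return out;
}
```

Unlike the queue-based approach, this version does a fixed amount of work per element (three passes over the data), at the cost of two extra arrays of size N.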