用于计算滚动窗口与给定线函数的平方和距离的算法

发布于 2024-12-25 01:58:04 字数 591 浏览 2 评论 0原文

给定一个线函数y = a*x + bab是以前已知的常数),很容易计算总和:直线与样本窗口之间的平方距离 (1, Y1), (2, Y2), ..., (n, Yn) (其中 Y1是最旧的样本,Yn 是最新的):

sum((Yx - (a*x + b))^2 for x in 1,...,n)

我需要一个快速算法来计算滚动窗口(长度为 n )的该值 - 我无法在每次新样本到达时重新扫描窗口中的所有样本。
显然,应该为进入窗口的每个新样本和离开窗口的每个旧样本保存和更新某些状态。
请注意,当样本离开窗口时,其余样本的不定数也会发生变化 - 每个 Yx 都变为 Y(x-1)。因此,当样本离开窗口时,窗口中的每个其他样本都会为新的总和贡献不同的值:(Yx - (a*(x-1) + b))^2 而不是 <代码>(Yx - (a*x + b))^2。

是否有已知的算法来计算这个?如果没有,你能想到一个吗? (由于一阶线性近似,出现一些错误是可以的)。

Given a line function y = a*x + b (a and b are previously known constants), it is easy to calculate the sum-of-squares distance between the line and a window of samples (1, Y1), (2, Y2), ..., (n, Yn) (where Y1 is the oldest sample and Yn is the newest):

sum((Yx - (a*x + b))^2 for x in 1,...,n)

I need a fast algorithm for calculating this value for a rolling window (of length n) - I cannot rescan all the samples in the window every time a new sample arrives.
Obviously, some state should be saved and updated for every new sample that enters the window and every old sample leaves the window.
Notice that when a sample leaves the window, the indecies of the rest of the samples change as well - every Yx becomes Y(x-1). Therefore when a sample leaves the window, every other sample in the window contribute a different value to the new sum: (Yx - (a*(x-1) + b))^2 instead of (Yx - (a*x + b))^2.

Is there a known algorithm for calculating this? If not, can you think of one? (It is ok to have some mistakes due to first-order linear approximations).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

独﹏钓一江月 2025-01-01 01:58:04

难道直接的方法就不能解决问题吗?……

我所说的“直接”是指维护一个样本队列。一旦新样本到达,您将:

  • 从队列中弹出最旧的样本,
  • 从总和中减去其距离
  • 将新样本添加到队列中
  • 计算其距离并将其添加到总和

至于时间,这里的所有内容都是 O(1) 如果队列被实现为链表或类似的东西,您也希望将距离与样本存储在队列中,因此您只计算一次。因此,每个样本的内存使用量为 3 个浮点数 - O(n)。

Won't a straightforward approach do the trick?...

By 'straightforward' I mean maintaining a queue of samples. Once a new sample arrives, you would:

  • pop the oldest sample from the queue
  • subtract its distance from your sum
  • append the new sample to the queue
  • calculate its distance and add it to your sum

As for time, everything here is O(1) if the queue is implemented as linked list or something similar, You would want to store the distance with your samples in queue, too, so you calculate it only once. The memory usage is thus 3 floats per sample - O(n).

天赋异禀 2025-01-01 01:58:04

如果展开术语 (Yx - (a*x + b))^2,这些术语将分为三个部分:

  1. 仅包含 a,x 的术语代码>和<代码>b。当对 n 求和时,它们会产生一些常量,并且可以忽略。
  2. Yxb 项。这些可以按照 @Xion 描述的 Boxcar 积分器的方式进行处理。
  3. -2*Yx*a*x 中的一项。 -2*a 是一个常量,因此请忽略该部分。考虑部分和 S = Y1*1 + Y2*2 + Y3*3 ... Yn*n。给定 Y1 和运行总和 R = Y1 + Y2 + ... + Yn,您可以找到消除 Y1 的 S - R *1 并减少其他每一项,得到 Y2*1 + Y3*2 + ... + Yn*(n-1)。现在,通过减去 Y1 并加上 Y(n+1),更新运行总和 R,如 (2) 所示。将新的 Yn*n 项添加到 S 中。

现在只需将所有这些部分项相加即可。

If you expand the term (Yx - (a*x + b))^2 the terms break into three parts:

  1. Terms of only a,x and b. These produce some constant when summed over n and can be ignored.
  2. Terms of only Yx and b. These can be handled in the style of a boxcar integrator as @Xion described.
  3. One term of -2*Yx*a*x. The -2*a is a constant so ignore that part. Consider the partial sum S = Y1*1 + Y2*2 + Y3*3 ... Yn*n. Given Y1 and a running sum R = Y1 + Y2 + ... + Yn you can find S - R which eliminates Y1*1 and reduces each of the other terms, leaving you with Y2*1 + Y3*2 + ... + Yn*(n-1). Now update the running sum R as for (2) by subtracting off Y1 and adding Y(n+1). Add the new Yn*n term to S.

Now just add up all those partial terms.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文