LINQ to calculate a moving average of a SortedList<DateTime, double>
I have a time series in the form of a SortedList<DateTime, double>. I would like to calculate a moving average of this series. I can do this using simple for loops, but I was wondering whether there is a better way to do it using LINQ.
My version:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                var mySeries = new SortedList<DateTime, double>();
                mySeries.Add(new DateTime(2011, 01, 1), 10);
                mySeries.Add(new DateTime(2011, 01, 2), 25);
                mySeries.Add(new DateTime(2011, 01, 3), 30);
                mySeries.Add(new DateTime(2011, 01, 4), 45);
                mySeries.Add(new DateTime(2011, 01, 5), 50);
                mySeries.Add(new DateTime(2011, 01, 6), 65);
                var calcs = new calculations();
                var avg = calcs.MovingAverage(mySeries, 3);
                foreach (var item in avg)
                {
                    Console.WriteLine("{0} {1}", item.Key, item.Value);
                }
            }
        }

        class calculations
        {
            public SortedList<DateTime, double> MovingAverage(SortedList<DateTime, double> series, int period)
            {
                var result = new SortedList<DateTime, double>();
                for (int i = 0; i < series.Count(); i++)
                {
                    if (i >= period - 1)
                    {
                        double total = 0;
                        for (int x = i; x > (i - period); x--)
                            total += series.Values[x];
                        double average = total / period;
                        result.Add(series.Keys[i], average);
                    }
                }
                return result;
            }
        }
    }
To achieve an asymptotic performance of O(n) (as the hand-coded solution does), you could use the Aggregate function.

The accumulated value (implemented as an anonymous type) contains two fields: Result contains the result list built up so far, and Working contains the last period-1 elements. The aggregate function adds the current value to the Working list, computes the current average and adds it to the result, and then removes the first (i.e. oldest) value from the working list.

The "seed" (i.e. the starting value for the accumulation) is built by putting the first period-1 elements into Working and initializing Result to an empty list. Consequently, the aggregation starts with element period (by skipping period-1 elements at the beginning).

In functional programming this is a typical usage pattern for the aggregate (or fold) function, by the way.

Two remarks:

The solution is not "functionally" clean in that the same list objects (Working and Result) are reused in every step. I'm not sure whether that might cause problems if some future compiler tries to parallelize the Aggregate function automatically (on the other hand, I'm also not sure whether that is possible at all). A purely functional solution should "create" new lists at every step.

Also note that C# lacks powerful list expressions. In some hypothetical Python-C#-mixed pseudocode, the aggregation function could be written more compactly, which would be a bit more elegant in my humble opinion :)
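The answer's original code did not survive on this page, so here is a minimal sketch of the Aggregate approach as described, not the author's exact code. The class name AggregateMovingAverage and the method name Calculate are hypothetical, and a Queue<double> stands in for the working list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateMovingAverage
{
    // Hypothetical helper; the Result/Working accumulator follows the description above.
    public static SortedList<DateTime, double> Calculate(
        SortedList<DateTime, double> series, int period)
    {
        if (series == null) throw new ArgumentNullException(nameof(series));
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));

        // Seed: Working holds the first period-1 values, Result starts out empty.
        var seed = new
        {
            Result = new SortedList<DateTime, double>(),
            Working = new Queue<double>(series.Values.Take(period - 1))
        };

        return series
            .Skip(period - 1)
            .Aggregate(seed, (acc, item) =>
            {
                acc.Working.Enqueue(item.Value);                  // add the current value
                acc.Result.Add(item.Key, acc.Working.Average());  // average over the full window
                acc.Working.Dequeue();                            // drop the oldest value
                return acc;                                       // same objects reused each step
            })
            .Result;
    }
}
```

Calling AggregateMovingAverage.Calculate(mySeries, 3) on the series from the question should give one entry per date from the third date onward. Note that Working.Average() still walks the whole window on every step; keeping a running sum next to the queue would make each step constant-time.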
For the most efficient way possible to compute a moving average with LINQ, you shouldn't use LINQ!
Instead, I propose creating a helper class that computes the moving average as efficiently as possible (using a circular buffer and a causal moving average filter), and then an extension method to make it accessible to LINQ.
First up, the moving average
This class provides a very fast and lightweight implementation of a MovingAverage filter. It creates a circular buffer of length N and computes one add, one subtract and one multiply per data point appended, as opposed to the N multiply-adds per point of the brute-force implementation.
Next, to LINQ-ify it!
The extension methods wrap the MovingAverage class and allow it to be inserted into an IEnumerable stream.
Now to use it!
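The code for this answer is missing from the page. The sketch below shows one way the three pieces it describes (filter class, extension method, usage) could fit together. It is an assumption-laden reconstruction, not the author's implementation: the member names Push, IsFull and Average, the extension name ToMovingAverage, and the choice to emit values only once the buffer is full are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: fixed-length circular buffer plus a running sum.
public sealed class MovingAverage
{
    private readonly double[] _buffer;
    private readonly double _oneOverLength;
    private double _sum;
    private int _next;   // slot holding the oldest value, overwritten next
    private int _count;  // samples seen so far, capped at the buffer length

    public MovingAverage(int period)
    {
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));
        _buffer = new double[period];
        _oneOverLength = 1.0 / period;
    }

    public bool IsFull { get { return _count == _buffer.Length; } }

    // One multiply per read; only meaningful once IsFull is true.
    public double Average { get { return _sum * _oneOverLength; } }

    public void Push(double value)
    {
        _sum += value - _buffer[_next];  // add the newest, subtract the oldest
        _buffer[_next] = value;
        _next = (_next + 1) % _buffer.Length;
        if (_count < _buffer.Length) _count++;
    }
}

public static class MovingAverageExtensions
{
    // Assumed behaviour: yields one pair per input element once the window is full.
    public static IEnumerable<KeyValuePair<TKey, double>> ToMovingAverage<TKey>(
        this IEnumerable<KeyValuePair<TKey, double>> source, int period)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        var filter = new MovingAverage(period);
        foreach (var item in source)
        {
            filter.Push(item.Value);
            if (filter.IsFull)
                yield return new KeyValuePair<TKey, double>(item.Key, filter.Average);
        }
    }
}
```

With the series from the question, foreach (var item in mySeries.ToMovingAverage(3)) would then enumerate the averaged pairs lazily. A running sum of doubles can accumulate rounding error over very long streams, so recomputing the sum from the buffer now and then may be worthwhile.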
You already have an answer showing how you can use LINQ, but frankly I wouldn't use LINQ here: it will most likely perform poorly compared to your current solution, and your existing code is already clear.
However, instead of calculating the total of the previous period elements on every step, you can keep a running total and adjust it on each iteration. That is, replace the inner summing loop with a single update that adds the newest value and subtracts the value that has just left the window.
This means that your code will take the same amount of time to execute regardless of the size of period.
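The before/after snippets of this answer were lost, so the following is a sketch of the running-total idea applied to the question's method, not necessarily the answerer's exact code. The class name RunningTotalMovingAverage and the method name Calculate are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class RunningTotalMovingAverage
{
    public static SortedList<DateTime, double> Calculate(
        SortedList<DateTime, double> series, int period)
    {
        if (series == null) throw new ArgumentNullException(nameof(series));
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));

        var result = new SortedList<DateTime, double>();
        double total = 0;

        for (int i = 0; i < series.Count; i++)
        {
            total += series.Values[i];               // add the newest value

            if (i >= period)
                total -= series.Values[i - period];  // remove the value that left the window

            if (i >= period - 1)                     // window is full from here on
                result.Add(series.Keys[i], total / period);
        }

        return result;
    }
}
```

Each element is added once and subtracted at most once, so the work per element does not depend on period.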
The summing loop in your method can be rewritten as a single LINQ expression, and the whole method can then be expressed as one query.
As you can see, LINQ is very expressive. I recommend starting with a tutorial such as Introducing LINQ and 101 LINQ Samples.
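The snippets from this answer are missing here. As a rough sketch of what a Skip/Take/Sum formulation could look like (the class name LinqMovingAverage and the method name Calculate are hypothetical, and this is not claimed to be the answerer's code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqMovingAverage
{
    public static SortedList<DateTime, double> Calculate(
        SortedList<DateTime, double> series, int period)
    {
        if (series == null) throw new ArgumentNullException(nameof(series));
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));

        // index 0 here is element period-1 of the series, so the window
        // ending at that element starts at position 'index' in Values.
        var averages = series
            .Skip(period - 1)
            .Select((item, index) => new KeyValuePair<DateTime, double>(
                item.Key,
                series.Values.Skip(index).Take(period).Sum() / period));

        var result = new SortedList<DateTime, double>();
        foreach (var pair in averages)
            result.Add(pair.Key, pair.Value);
        return result;
    }
}
```

This reads nicely but re-sums every window, and Skip may have to walk the list from the start each time, so it is likely slower than the loop with a running total on long series.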
To do this in a more functional way, you'd need a Scan method, which exists in Rx but not in LINQ. Let's look at how it would look if we had a Scan method.
And here's the Scan method, taken and adjusted from here:
This should have better performance than the brute-force method, since we are using a running total to calculate the SMA.
What's going on here?
To start, we need to calculate the first period, which we call seed here. Then we calculate every subsequent value from the accumulated seed value. To do that we need the old value (that is, t-delta) and the newest value, so we zip the series with itself, once from the beginning and once shifted by the delta.
At the end we do some cleanup by adding zeroes for the length of the first period and adding the initial seed value.
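Since neither the Scan method nor the query from this answer survived, here is a sketch of the technique under stated assumptions. The Scan below is a hand-written stand-in (not the Rx operator), Calculate is a hypothetical name, and padding with period-1 zeroes so that the output lines up with the input is my reading of the "cleanup" step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScanMovingAverage
{
    // Stand-in Scan: like Aggregate, but yields every intermediate accumulator value.
    public static IEnumerable<TAcc> Scan<TSource, TAcc>(
        this IEnumerable<TSource> source, TAcc seed, Func<TAcc, TSource, TAcc> func)
    {
        TAcc acc = seed;
        foreach (var item in source)
        {
            acc = func(acc, item);
            yield return acc;
        }
    }

    public static IEnumerable<double> Calculate(IEnumerable<double> series, int period)
    {
        if (series == null) throw new ArgumentNullException(nameof(series));
        if (period < 1) throw new ArgumentOutOfRangeException(nameof(period));

        var values = series.ToList();
        if (values.Count < period)
            return Enumerable.Repeat(0.0, values.Count);  // no full window at all

        // Seed: the total of the first full window.
        double seed = values.Take(period).Sum();

        // Pair each newest value with the value leaving the window
        // (the series zipped with itself, shifted by 'period').
        var totals = values
            .Skip(period)
            .Zip(values, (newest, oldest) => newest - oldest)
            .Scan(seed, (total, delta) => total + delta);

        // Cleanup: zeroes while the window is incomplete, then the seed, then the rest.
        return Enumerable.Repeat(0.0, period - 1)
            .Concat(new[] { seed })
            .Concat(totals)
            .Select(total => total / period);
    }
}
```

For the six values in the question and a period of 3, this yields six numbers: two zeroes followed by the four window averages.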
Another option is to use MoreLINQ's Windowed method, which simplifies the code significantly:
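The answer's snippet is missing. To keep the example self-contained, the sketch below defines a small local Windowed helper that imitates a sliding-window operator; it is a stand-in, not MoreLINQ's implementation, and its exact signature is an assumption (as far as I know, newer MoreLINQ releases call the operator Window). WindowedMovingAverage and Calculate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowedMovingAverage
{
    // Local stand-in for a sliding-window operator: yields each run of 'size'
    // consecutive elements, and nothing if the source is shorter than 'size'.
    public static IEnumerable<IList<T>> Windowed<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return WindowedIterator(source, size);
    }

    private static IEnumerable<IList<T>> WindowedIterator<T>(IEnumerable<T> source, int size)
    {
        var window = new Queue<T>(size);
        foreach (var item in source)
        {
            window.Enqueue(item);
            if (window.Count > size) window.Dequeue();   // slide: drop the oldest
            if (window.Count == size) yield return window.ToArray();
        }
    }

    // Each full window becomes (key of its last element, mean of its values).
    public static IEnumerable<KeyValuePair<DateTime, double>> Calculate(
        SortedList<DateTime, double> series, int period)
    {
        return series
            .Windowed(period)
            .Select(w => new KeyValuePair<DateTime, double>(
                w.Last().Key,
                w.Average(pair => pair.Value)));
    }
}
```

The query itself shrinks to two operators, at the cost of copying and re-averaging every window.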
I use this code to calculate SMA:
Here is an extension method: