How do I detect "missing" elements in an IEnumerable<T>?
I've got an IEnumerable<T> containing a list of data elements with consistent intervals in one of the properties:
List<Interval> list = new List<Interval>
{
    new Interval{ TIME_KEY = 600},
    new Interval{ TIME_KEY = 605},
    new Interval{ TIME_KEY = 615},
    new Interval{ TIME_KEY = 620},
    new Interval{ TIME_KEY = 630}
};
How can I query this list (using Linq, preferably), to get a List that looks like this:
List<Interval> list = new List<Interval>
{
    new Interval{ TIME_KEY = 610},
    new Interval{ TIME_KEY = 625}
};
?
EDIT: I will probably know what the interval distance is supposed to be, but if there's a way to determine it by examining the data, that would be a huge bonus!
EDIT: changed to numeric values
Have a look at this question for an extension method which selects consecutive values. From there, you could do something like:
(Somewhat pseudocode, but I hope you see where I'm going.)
However, that would only generate one element when it's missing... if you went from 0 to 20, it wouldn't generate 5, 10, 15.
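A minimal sketch of that idea, assuming a sorted input and a known interval. The Pairwise helper and the FirstMissingPerGap method are hypothetical names chosen for illustration, not the extension method from the linked question. It shows the limitation described above: each gap yields only one element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public int TIME_KEY { get; set; }
}

public static class ConsecutiveExtensions
{
    // Hypothetical helper (the name Pairwise is an assumption): projects each
    // element together with its successor, enumerating the source only once.
    public static IEnumerable<TResult> Pairwise<T, TResult>(
        this IEnumerable<T> source, Func<T, T, TResult> selector)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;
            T previous = e.Current;
            while (e.MoveNext())
            {
                yield return selector(previous, e.Current);
                previous = e.Current;
            }
        }
    }

    // Emits ONE element after every pair whose gap differs from the interval.
    // Assumes the input is sorted by TIME_KEY.
    public static List<Interval> FirstMissingPerGap(
        IEnumerable<Interval> sorted, int interval)
    {
        return sorted
            .Pairwise((a, b) => new { Start = a.TIME_KEY, Gap = b.TIME_KEY - a.TIME_KEY })
            .Where(p => p.Gap != interval)
            .Select(p => new Interval { TIME_KEY = p.Start + interval })
            .ToList();
    }
}
```

With the question's sample data and an interval of 5 this yields 610 and 625, but for keys 0 and 20 it yields only 5.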
To put some meat on Henk's second suggestion:
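One way to read that suggestion (build the full list of expected keys and take the difference) is sketched below. The class and method names are assumptions made for illustration, and the interval is assumed to be known and positive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public int TIME_KEY { get; set; }
}

public static class FullRangeDiff
{
    // Builds every key expected between the smallest and largest TIME_KEY,
    // then removes the keys that are actually present.
    public static List<Interval> FindMissing(
        IReadOnlyCollection<Interval> items, int interval)
    {
        if (interval <= 0) throw new ArgumentOutOfRangeException(nameof(interval));
        if (items.Count == 0) return new List<Interval>();

        int min = items.Min(i => i.TIME_KEY);
        int max = items.Max(i => i.TIME_KEY);
        int count = (max - min) / interval + 1;

        return Enumerable.Range(0, count)
            .Select(i => min + i * interval)          // full list of expected keys
            .Except(items.Select(i => i.TIME_KEY))    // minus the keys we have
            .Select(k => new Interval { TIME_KEY = k })
            .ToList();
    }
}
```

Unlike the pairwise approach, this fills wide gaps completely and does not need the input to be sorted, at the cost of materialising the full range of keys.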
An efficient and simple way would be to go through the list with foreach and detect the gaps. I assume the 5-minute interval is fixed?
To use LINQ you could create the full list and find the difference, but that seems overkill.
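The foreach approach might look roughly like this. It is a sketch that assumes the list is sorted by TIME_KEY and that the interval is known and positive; GapScanner is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public int TIME_KEY { get; set; }
}

public static class GapScanner
{
    // Single pass over a sorted sequence, collecting every skipped key.
    public static List<Interval> FindMissing(IEnumerable<Interval> sorted, int interval)
    {
        if (interval <= 0) throw new ArgumentOutOfRangeException(nameof(interval));

        var missing = new List<Interval>();
        int? expected = null; // key we expect to see next; null before the first element

        foreach (var item in sorted)
        {
            // Fill in every key skipped between the previous element and this one.
            while (expected.HasValue && expected.Value < item.TIME_KEY)
            {
                missing.Add(new Interval { TIME_KEY = expected.Value });
                expected += interval;
            }
            expected = item.TIME_KEY + interval;
        }
        return missing;
    }
}
```

This enumerates the source only once, so it also works on a lazy enumerable without buffering.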
Considering the 2nd part, determining the interval:
From your example, a sample of 3 or 4 values would probably do. But you cannot be absolutely sure even after examining all the values: your example data does not exclude a 1-minute frequency with a lot of missing values.
So you need very good specifications regarding this part.
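To make the ambiguity concrete: the greatest common divisor of the differences between neighbouring keys is the largest interval consistent with the data, but every divisor of it (including 1) fits the data equally well. The sketch below assumes sorted, distinct integer keys; IntervalGuess and its method are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntervalGuess
{
    // Euclid's algorithm for non-negative integers.
    static int Gcd(int a, int b)
    {
        while (b != 0)
        {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Largest interval that divides every difference between neighbours.
    // Returns 0 when there are fewer than two keys.
    public static int LargestConsistentInterval(IReadOnlyList<int> sortedKeys)
    {
        int g = 0;
        for (int i = 1; i < sortedKeys.Count; i++)
        {
            g = Gcd(g, sortedKeys[i] - sortedKeys[i - 1]);
        }
        return g;
    }
}
```

For the keys 600, 605, 615, 620, 630 this gives 5, yet an interval of 1 cannot be ruled out from the data alone.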
This would work if the interval is known and you have access to the Zip method (comes with .NET 4):
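A sketch of such a query is given here. It is a reconstruction based on the names used in this answer's explanation (x, delta, SelectMany), not necessarily the original code, and ZipGaps is an illustrative name. It assumes the input is sorted, has no duplicate keys, and that every difference is a multiple of the interval.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public int TIME_KEY { get; set; }
}

public static class ZipGaps
{
    public static List<Interval> FindMissing(IList<Interval> sorted, int interval)
    {
        return sorted
            // Pair each element (except the last) with its successor and keep
            // the starting key plus the distance to the next key.
            .Zip(sorted.Skip(1),
                 (a, b) => new { x = a.TIME_KEY, delta = b.TIME_KEY - a.TIME_KEY })
            // Each pair contributes delta/interval - 1 missing keys.
            // Enumerable.Range throws if that count is negative, which is why
            // sorted, duplicate-free input is assumed.
            .SelectMany(a => Enumerable.Range(1, a.delta / interval - 1)
                                       .Select(i => new Interval { TIME_KEY = a.x + i * interval }))
            .ToList();
    }
}
```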
Note that this iterates the list twice, so in case the source is a lazy enumerable, you need to buffer it first. That construction with Zip and Skip is a quick and dirty way of projecting consecutive elements together. Reactive Extensions' System.Interactive library has a Scan method for that, and Jon showed a possible implementation in another answer. Neither of those iterates the list twice, so they would be a much better choice.
If the interval is to be determined, you can get the minimum delta:
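Getting the minimum delta could be sketched like this, again with Zip and Skip. DeltaHelper and MinimumDelta are illustrative names, and the input is assumed to be sorted and to contain at least two elements (Min throws InvalidOperationException on an empty sequence).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public int TIME_KEY { get; set; }
}

public static class DeltaHelper
{
    // Smallest difference between neighbouring keys in a sorted list.
    public static int MinimumDelta(IList<Interval> sorted)
    {
        return sorted
            .Zip(sorted.Skip(1), (a, b) => b.TIME_KEY - a.TIME_KEY)
            .Min();
    }
}
```

For the sample data the deltas are 5, 10, 5, 10, so the result is 5. The guess is only right if at least one pair of neighbours has no gap between them.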
There are some assumptions I made though:
How it works:
Within each gap there are a.delta/interval - 1 missing values, and each of these is a whole number of intervals away from the element stored in the pair, hence a.x + i*interval. SelectMany takes care of flattening all those sequences of missing values into one.
Try this:
The initial list is assumed to be sorted.