查找序列中缺失和重叠的数字
假设我们有一个这样的数据结构:
var sequences = new List<Tuple<int, int>>
{
new Tuple<int, int>(1, 10),
new Tuple<int, int>(8, 101),
new Tuple<int, int>(102, 103),
new Tuple<int, int>(104, 104),
new Tuple<int, int>(110, 200)
};
我想从这个集合中得到两个结果:
- 所有缺失的数字(在本例中:105、106、107、108、109)
- 所有重叠的数字(在本例中:8) , 9, 10)
我可以编写一个带有几个循环和辅助集合的算法。这当然可以,但我想知道这是否可以通过 LINQ 和/或其他更简单、更短的算法来实现?
编辑: 上面示例中的我的数据结构表示 5 个序列,第一个包含从 1 到 10 的数字,第二个包含从 8 到 101 的数字,依此类推...因为在生产中序列可能会更大(最多数以百万计),它们不是用实际的集合(例如包含所有数字的列表)表示的,而是用元组表示的,元组表示每个序列的最小和最大数量。
Let's say we have a data structure like this:
var sequences = new List<Tuple<int, int>>
{
new Tuple<int, int>(1, 10),
new Tuple<int, int>(8, 101),
new Tuple<int, int>(102, 103),
new Tuple<int, int>(104, 104),
new Tuple<int, int>(110, 200)
};
I would like to get two results from this collection:
- All the missing numbers (in this example: 105, 106, 107, 108, 109)
- All the overlapping numbers (in this example: 8, 9, 10)
I could write an algorithm with a couple of loops and helper collections. This off course would work, but I wonder if this could be achieved with the help of LINQ and/or other easier and shorter algorithms?
Edit:
My data structure from the example above is representing 5 sequences, the first one containing numbers from 1 to 10, the second one containing the numbers from 8 to 101 and so on... Because in the production the sequences can be much bigger (up to millions), they are not represented with the actual collection (for instance with Lists with all the numbers), but with Tuples, which represents the minimal and the maximal number of each sequence.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
有一些边缘情况我只能假设你希望如何处理。我选择不处理其中之一(在代码中注释)。由于您没有给出如何表示丢失/重叠序列的指示,因此我选择了您自己的格式,使用元组来标识序列的开始和结束。
There's a few edge cases I can only assume how you wish to handle. I've chosen not to handle one of them (commented in the code). Since you've given no indication of how you would want to represent missing/ovrelapping sequences I've chosen your own format using tuples to identify the start and end of the sequence.
你可以通过
You could do it via
我知道这个问题的算法(它是伪代码)。 (复杂度类
O(nlog(n))
其中 n 是元组的计数)因此解决方案是按函数对元组进行排序:
例如元组: (1, 10), (1, 5), (2 , 8) 将排序为:
(1,5)、(1、10)、(2、8)。
下一步是累积这个结果。迭代这个结果并且:
我们知道,如果将迭代下一个元组,第一个数字大于或等于 result.first。代码中的注释会告诉您哪里有重叠和缺失的数字
I know algorithm for this problem (it is pseudoCode). ( Complexity classes
O(nlog(n))
where n is count of tuple)So solution is sort Tuple by function:
so example tuple: (1, 10), (1, 5), (2, 8) will be sort to:
(1,5), (1, 10), (2, 8).
Next step is accumulate this result. Iterate on this result and :
What we know, that if will be iterate next tuple first number is bigger or equals than result.first. Commentate in code tell you where you have overlapping and missing number
试试这个
try this
一次性完成:
In one pass: