Fastest way to search a collection by DateTime
I have a Dictionary containing 10 keys, each mapped to a list of up to 30,000 values. Each value has a DateTime property.
I frequently need to extract a small subset of the values for one key, such as a date range of 30 to 60 seconds.
Doing this is easy, but making it run fast is not. What would be the most efficient way to query this in-memory data?
Thanks a lot.
Comments (4)
Sort each list by date first, then use binary search to find the first and last index of the range you need. Locating the two boundaries is O(log n), and returning the k matching items is O(k), so each query costs O(k + log n) overall.
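To make the sort-then-binary-search idea concrete, here is a minimal, language-neutral sketch in Python (the thread itself is about .NET, where List<T>.BinarySearch plays a similar role). The `Reading` type, the `range_query` helper, and the parallel `timestamps` list are assumptions made up for illustration, not anything from the question.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    # Hypothetical value type; only the timestamp matters for the search.
    timestamp: datetime
    payload: int


def range_query(items, timestamps, start, end):
    """Return the items whose timestamp lies in [start, end].

    `items` must already be sorted by timestamp, and `timestamps` must hold
    the timestamp of each item at the same index.
    """
    lo = bisect_left(timestamps, start)   # first index with timestamp >= start
    hi = bisect_right(timestamps, end)    # first index with timestamp > end
    return items[lo:hi]                   # copying the k matches is O(k)


base = datetime(2024, 1, 1, 12, 0, 0)
items = [Reading(base + timedelta(seconds=i), i) for i in range(30_000)]
items.sort(key=lambda r: r.timestamp)     # one-time O(n log n) sort
timestamps = [r.timestamp for r in items]

# A 30-second window starting 100 seconds in, both ends inclusive.
window = range_query(items, timestamps,
                     base + timedelta(seconds=100),
                     base + timedelta(seconds=130))
```

The sort is paid once; every later query only does the two binary searches plus the copy of the matching slice.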
1) Keep the dictionary, but use a SortedList, sorted by the DateTime property, instead of a plain list as each dictionary value.
2) Implement a binary search to find the lower and upper edges of your range in the sorted list, which gives you their indexes.
3) Just select the values in the range using
Sortedlist.Values.Skip(lowerIndex).Take(upperIndex - lowerIndex)
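The three steps above could look roughly like this. It is a Python sketch rather than .NET code, and `SortedSeries`, `store`, and `add` are hypothetical names introduced only for this example. One thing worth checking if you do use .NET's SortedList<TKey, TValue>: it rejects duplicate keys, so values sharing a timestamp would need separate handling; the sketch below simply allows them.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


class SortedSeries:
    """Values for one dictionary key, kept ordered by timestamp."""

    def __init__(self):
        self._times = []    # sorted timestamps
        self._values = []   # values, aligned index-for-index with _times

    def add(self, when, value):
        # Insert after any equal timestamps so arrival order is preserved.
        # Finding the slot is O(log n); the list insert itself is O(n).
        i = bisect_right(self._times, when)
        self._times.insert(i, when)
        self._values.insert(i, value)

    def between(self, start, end):
        lower = bisect_left(self._times, start)   # lower edge of the range
        upper = bisect_right(self._times, end)    # one past the upper edge
        return self._values[lower:upper]          # same role as Skip/Take


store = {}  # key -> SortedSeries, mirroring the original dictionary


def add(key, when, value):
    store.setdefault(key, SortedSeries()).add(when, value)


t0 = datetime(2024, 1, 1)
for s in (50, 10, 40, 20, 30):                    # deliberately out of order
    add("sensor-a", t0 + timedelta(seconds=s), f"v{s}")
add("sensor-b", t0 + timedelta(seconds=25), "other")

hits = store["sensor-a"].between(t0 + timedelta(seconds=15),
                                 t0 + timedelta(seconds=40))
```

If the data mostly arrives in time order, each insert lands at the end of the list and stays cheap.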
In reply to Aliostad: I don't think bsearch will work if the collection's underlying list is a linked list. Without random access, the search still takes O(n).
The fastest way will be to organize the data so that it is indexed by the thing you want to search on. Currently you have it indexed by key, but you want to search by date. I think you would be best off indexing it by date, if that is what you want to be able to search on.
I would keep two dictionaries: one indexed as you do now, and one where the items are indexed by date. I would decide on a time frame (say 1 minute), add each object to a list based on the minute it falls in, and store each list in the dictionary under the key of that minute. Then, when you want the data for a particular time frame, generate the relevant minute(s) and get the list(s) from the dictionary. This relies on you being able to work out the key in the other dictionary from the objects, though.
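A rough sketch of the minute-bucket index described above, again in Python rather than .NET. `MinuteIndex`, `bucket_of`, and the one-minute `BUCKET` constant are names I made up for illustration. The final filter is an assumption on my part: the first and last buckets usually contain items just outside the requested range, so they are checked individually.

```python
from collections import defaultdict
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=1)  # assumed bucket size, as suggested in the answer


def bucket_of(when):
    # Truncate a timestamp to the start of its minute.
    return when.replace(second=0, microsecond=0)


class MinuteIndex:
    def __init__(self):
        # minute start -> list of (timestamp, item) pairs
        self._buckets = defaultdict(list)

    def add(self, when, item):
        self._buckets[bucket_of(when)].append((when, item))

    def between(self, start, end):
        hits = []
        minute = bucket_of(start)
        while minute <= end:
            # .get avoids creating empty buckets during lookups.
            for when, item in self._buckets.get(minute, ()):
                if start <= when <= end:
                    hits.append(item)
            minute += BUCKET
        return hits


index = MinuteIndex()
t0 = datetime(2024, 1, 1, 9, 0, 0)
for s in range(0, 300, 10):            # one item every 10 s for 5 minutes
    index.add(t0 + timedelta(seconds=s), s)

# A 50-second window that straddles a minute boundary.
hits = index.between(t0 + timedelta(seconds=45), t0 + timedelta(seconds=95))
```

A 30 to 60 second query touches at most two buckets here, so the cost depends on how many items share those minutes rather than on the full 30,000. If the per-key split matters, one such index per dictionary key would be one way to keep it.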