用于在一段时间之前驱逐事件的有效数据结构?
我有一个由 ID 和日期组成的对象集合。我希望以一种可以通过 ID 有效查找它们的方式存储这些对象,并删除某个时间点之前发生的所有事件。
我正在考虑使用 HashMap 和 TreeMap,其中 HashMap 保存 ID,TreeMap 存储按日期排序的元素。这使得通过 ID 进行 O(1) 查找,让我可以有效地删除所有旧事件。我还尝试使用日期排序 TreeMap,而不使用 ID 哈希表。
是否有更有效的数据结构来存储信息以有效地支持这些操作?
I have a collection of objects consisting of an ID and a date. I want to store these objects in a way where I can efficiently look them up by ID, and also delete all events that occur before some point in time.
I was thinking about using a HashMap and TreeMap, where the HashMap holds IDs and the TreeMap stores the elements sorted by date. This gives O(1) lookup by ID and lets me efficiently remove all old events. I also tried using a sorted TreeMap of the dates without the hash table of IDs.
Is there an even more efficient data structure for storing information to efficiently support these operations?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
鉴于您想要支持的操作是
我认为您可能希望使用链式哈希表和展开树组合的数据结构。基本思想如下:哈希表将 ID 值映射到相关对象,展开树存储按日期排序的对象。要插入到结构中,您需要在 O(log n) 摊销时间内将元素插入到 ID 哈希表和展开树中。您可以在哈希表中以 O(1) 预期时间进行查找。
根据问题的参数,您可以非常有效地删除某个时间之前出现的所有元素。想法如下。伸展树高效(摊销 O(log n))支持删除树中时间早于某个特定值的所有内容。现在,如果插入到该结构中的条目具有这样的属性,即每当您删除某个时间以下的所有值时,您永远不会在该时间之前插入条目,则可以使用以下删除过程:首先,使用高效的算法删除所有值splay 树中需要在总摊销时间 O(log n) 中删除的条目,然后记录刚刚删除的时间。从那时起,每当您在哈希表中进行查找时,如果您看到时间低于给定时间阈值的元素,则只需将其删除即可。如果您在重新哈希期间执行此操作,那么您可以将从哈希表中删除应删除的所有内容所需的 O(n) 工作分散到需要的某一时刻,从而可以分摊工作。这仍然为您提供了 O(1) 按 ID 查找时间和 O(1) 摊销到哈希表中的插入时间。简而言之,如果您可以做出此假设,您将获得以下运行时间:
希望这有帮助!
Given that the operations you want to support are
I think that you may want to use a data structure that's a combination of a chained hash table and a splay tree. The basic idea is as follows: the hash table maps ID values to the object in question, and the splay tree stores the objects sorted by date. To insert into the structure, you insert the element into both the ID hash table and splay tree in O(log n) amortized time. You can do lookups in O(1) expected time in the hash table.
Depending on the parameters of your problem, you can potentially delete all elements that appear before some time extremely efficiently. The idea is as follows. The splay tree efficiently (amortized O(log n)) supports deleting everything in the tree whose time is before some particular value. Now, if the entries you insert into this structure have the property that whenever you delete all values below some time, you never then insert an entry before that time, you can use the following deletion procedure: first, use the efficient algorithm for deleting all entries from the splay tree that need to be removed in total amortized time O(log n), then record the time that you just deleted. From that point forward, any time you do a lookup in the hash table, if you ever see an element whose time is below the given time threshold, you just delete it. If you do this during a rehash, then you can spread the O(n) work required to remove everything from the hash table that should be deleted to just one instant in time where it's required, and so can amortize the work. This still gives you O(1) lookup time by ID and O(1) amortized insertion time into the hash table. In short, if you can make this assumption, you'll get the following runtimes:
Hope this helps!