在 O(n) 时间内找到链表中条目的索引
我有一个场景,我将更改列表推送到另一个系统。每个列表包含零个或多个插入、更新或删除的通知。
插入很容易;该通知包含目标索引和指向该项目的指针。更新很容易;我传递了一个指向该项目的指针。
删除看起来很简单;我需要传递要删除的项目的索引,但是我如何知道索引呢?索引从零开始并且必须是连续的,但我在插入时弥补它们。所以我需要跟踪我为每个项目建立的索引。
例如,我可以使用地图来做到这一点:std::map
,但是当我删除一个项目时,我必须重新编号它后面的所有内容,这是 O(N)。
这些项目列表将大到 O(N) 迭代不可接受的程度。我确信这个问题已经解决了,我只是不知道解决方案会被称为什么。搜索与“链接列表”相关的任何内容都会产生大量噪音。
一种可能的解决方案是跳跃列表,其中子列表中的每个节点都知道它跳过主列表中的多少个节点,并且由于搜索跳跃列表的时间复杂度为 O(log N),因此我们可以随时跟踪并找到索引O(log N) 并删除 O(log N) 中的项目。
然而,在这里实现跳过列表似乎有点矫枉过正……有更简单的解决方案吗?
编辑:
感谢大家的建议,但我想我已经说服自己跳过列表是解决此问题的正确方法。
I have a scenario where I'm pushing change-lists to another system. Each list contains zero or more inserted, updated or deleted notifications.
Inserting is easy; the notification contains the target index and a pointer to the item. Updating is easy; I pass a pointer to the item.
Deleting seems straight-forward; I need to pass the index of the item to delete, but how do I know the index? Indexes start at zero and must be contiguous, but I make them up at insertion time. So I need to keep track of the index I make up for each item.
I can do this with, for example, a map: std::map<item*, int>
, but then when I remove an item, I have to go re-number everything past it, which is O(N).
These lists of items are going to be large to the point where O(N) iteration is not acceptable. I'm sure this problem has been solved, I just don't know what the solution would be called. Searching for anything related to "linked list" creates a ton of noise.
One possible solution is a skip-list, where each node in the sublists knows how many nodes in the main list it skips, and since searching a skip list is O(log N) we can keep track as we go and find the index in O(log N) and also delete items in O(log N).
However implementing a skip-list seems like overkill here... is there a simpler solution?
EDIT:
Thanks all for your suggestions, but I think I've convinced myself the skip list is the right way to solve this problem here.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
请参阅手指树:一种简单的通用数据结构作者:辛兹和帕特森。
另请参阅 MarkCC 关于手指树的博客文章中的精美插图。
See Finger Trees: A Simple General-purpose Data Structure by Hinze and Paterson.
See also the nice illustrations in MarkCC's blog post on finger trees.
编辑:我以前的解决方案有缺陷 std::vector::erase() 在移动元素时是线性的。我的新建议将我之前的评论延伸到你的问题。
如果您只使用列表中的指针,则可以在对指针调用删除后将指针设置为 0,从而保持索引/键有效。然后,您应该能够使用越来越大的索引,直到下一个索引超出
std::numeric_limits::max()
。然后,当列表的较大部分包含设置为零的未使用指针元素时,对通信通道两侧的零指针执行同步清理,然后重新计算索引。我不知道有什么好的启发式方法,但您可以跟踪零指针的数量,并将其与总体列表大小进行比较。
简而言之,由于索引的计算是 O(n),因此请延迟它,直到绝对需要为止。
edit: My previous solution was flawed std::vector::erase() is linear when moving elements. My new suggestion extends my previous comment to your question.
If you just work with pointers in the list, you can set the pointer to 0 after calling delete on the pointer, keeping the indices/keys valid. Then you should be able to use increasingly larger indices until the next index goes beyond
std::numeric_limits<int>::max()
.Then, when a larger portion of your list contains unused pointer elements set to zero, perform a synchronized cleanup of zero-pointers on both sides of the communication channel, followed by a recalculation of the indices. I don't know a good heuristic for this off the cuff, but you could keep track of the number of zero pointers, and compare it to the overall list size.
In fewer words, since calculation of the indices is O(n), delay it until you absolutely have to.
当您在
std::map
中进行查找时,您不能保留删除历史记录并对此进行补偿吗?我的意思是
std::map
中的索引表示项目的原始索引,然后您有一个辅助映射std::map
它存储给定索引被删除的次数?当然,这个速度取决于历史记录的大小。
/AB
Couldn't you keep the deletion history and compensate for this when you do the lookup in the
std::map<item*, int>
?What I mean is that the index in the
std::map
represents the original index of the item and then you have an auxillary mapstd::map<int, int>
which stores how many times a given index has been deleted?The speed of this will of course depend on the size of the history.
/A.B.
删除的频率是多少?我正在考虑使用
std::map
保留您的解决方案,但不是在删除时更新地图,而是将链接列表中的项目替换为“NULL”项目确保查找图中的索引仍然有效。如果您会看到频繁的删除并且有可能会耗尽内存,那么这可能不是一个好的解决方案。另外,您可以执行此操作,并使用 reindex() 方法从链接列表中删除任何 NULL 项,并为所有项分配新索引。旁注 1:
您不能像更新一样将指针传递给要删除的项目吗?如果您执行此操作并使用双链表,则可以执行删除操作很容易在 O(1) 内完成。
旁注2:
考虑使用
boost::unordered_map
超过std::map
。How often will deletion occur? I'm thinking of keeping your solution using
std::map<item*, int>
, but instead of updating the map upon deletion replacing the item in the linked list with a "NULL"-item to ensure that the indices in your lookupmap remains valid. This may not be a good solution if you will see frequent deletions and there's a chance you will run out of memory. Also, you could do this and have a reindex()-method that removes any NULL item from the linked list and assigns new indices to all items.Sidenote 1:
Can't you pass the pointer to item being deleted as you do in update? If the you do this and use a double-linked list the deletion operation could be performed easily in O(1).
Sidenote 2:
Consider using
boost::unordered_map
overstd::map
.跳过列表是正确的解决方案。
Skip List is the correct solution.