Architectural C++/STL question about iterator usage for O(1) list removal by external systems
This is a pretty straightforward architectural question, but it's been niggling at me for ages.
The whole point of using a list, for me anyway, is that it's O(1) insert/remove.
The only way to have an O(1) removal is to have an iterator for erase().
The only way to get an iterator is to keep hold of it from the initial insert() or to find it by iteration.
So, what to pass around; an Iterator or a pointer?
It would seem that if it's important to have fast removal, such as some sort of large list which is changing very frequently, you should pass around an iterator, and if you're not worried about the time to find the item in the list, then pass around the pointer.
Here is a typical cut-down example:
In this example we have some type called Foo. In real code the list would likely hold base-class pointers, but Foo is a concrete type here for simplicity.
Then we have FooManager, which holds a list of FooPtr (a shared_ptr to Foo). The manager is responsible for the lifetime of the object once it has been passed to it.
Now, what to return from addFoo()?
If I return a FooPtr then I can never remove it from the list in O(1), because I will have to find it in the list.
If I return a std::list iterator, FooPtrListIterator, then anywhere I need to remove the FooPtr I can, just by passing the iterator back to the manager's erase function.
In this example I have a contrived example of a Foo which can kill itself under some circumstance, Foo::killWhenConditionMet().
Imagine some Foo that has a timer which is ticking down to 0, at which point it needs to ask the manager to delete it. The trouble is that 'this' is a naked Foo*, so the only way for the object to delete itself is to call FooManager::eraseFoo() with a raw pointer. Now the manager has to search for the object pointer to get an iterator so the object can be erased from the list and destroyed.
The only way around that is to store the iterator in the object, i.e. Foo has a FooPtrListIterator as a member variable.

    #include <cassert>
    #include <cstdio>
    #include <list>
    #include <boost/shared_ptr.hpp>

    struct Foo;
    typedef boost::shared_ptr<Foo> FooPtr;
    typedef std::list<FooPtr> FooPtrList;
    typedef FooPtrList::iterator FooPtrListIterator;

    struct FooManager
    {
        FooPtrList l;

        // Takes ownership of foo and returns its position in the list.
        FooPtrListIterator addFoo(Foo *foo) {
            return l.insert(l.begin(), FooPtr(foo));
        }

        // O(1): the caller already knows the position.
        void eraseFoo(FooPtrListIterator foo) {
            l.erase(foo);
        }

        // O(n): the list has to be searched for the raw pointer first.
        void eraseFoo(Foo *foo) {
            for (FooPtrListIterator it=l.begin(), ite=l.end(); it!=ite; ++it) {
                if ((*it).get()==foo){
                    eraseFoo(it);
                    return;
                }
            }
            // A bare string literal is always true, so force the failure.
            assert(false && "foo not found!");
        }
    };

    FooManager g_fm;

    struct Foo
    {
        int _v;

        Foo(int v):_v(v) {
        }

        ~Foo() {
            printf("~Foo %d\n", _v);
        }

        void print() {
            printf("%d\n", _v);
        }

        void killWhenConditionMet() {
            // Do something that will eventually kill this object, like a timer
            g_fm.eraseFoo(this);
        }
    };

    void printList(FooPtrList &l)
    {
        printf("-\n");
        for (FooPtrListIterator it=l.begin(), ite=l.end(); it!=ite; ++it) {
            (*it)->print();
        }
    }

    void test2()
    {
        FooPtrListIterator it1=g_fm.addFoo(new Foo(1));
        printList(g_fm.l);
        FooPtrListIterator it2=g_fm.addFoo(new Foo(2));
        printList(g_fm.l);
        FooPtrListIterator it3=g_fm.addFoo(new Foo(3));
        printList(g_fm.l);
        (*it2)->killWhenConditionMet();   // falls back to the O(n) search
        printList(g_fm.l);
    }

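For comparison, a minimal sketch of the alternative mentioned above, where the object keeps its own list position, might look like the following. It assumes C++11 and std::shared_ptr rather than boost::shared_ptr, and the names Node and Registry are hypothetical stand-ins for Foo and FooManager, not part of the code above.

```cpp
#include <cassert>
#include <list>
#include <memory>

struct Node;  // hypothetical stand-in for Foo
typedef std::shared_ptr<Node> NodePtr;
typedef std::list<NodePtr> NodeList;

struct Node {
    int value;
    NodeList::iterator self;  // position in the owning list, set by Registry::add
    explicit Node(int v) : value(v) {}
};

struct Registry {  // hypothetical stand-in for FooManager
    NodeList items;

    // Takes ownership of a new Node and records its position inside the Node.
    Node *add(int v) {
        NodeList::iterator it = items.insert(items.end(), std::make_shared<Node>(v));
        (*it)->self = it;
        return it->get();
    }

    // O(1): no search, the node already knows where it lives.
    // The iterator is copied into erase() before the node is destroyed.
    void erase(Node *n) {
        assert(n != nullptr);
        items.erase(n->self);  // destroys *n if the list held the last reference
    }
};
```

With this layout a node can ask for its own removal by passing `this` to `Registry::erase`, provided nothing touches the node after the call returns.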
So, the questions I have are:
1. If an object needs to delete itself, or have some other system delete it, in O(1), do I have to store an iterator to the object inside the object? If so, are there any gotchas to do with iterators becoming invalid due to other container iterations?
2. Is there simply another way to do this?
As a side question, does anyone know why none of the 'push*' STL container operations return the resulting iterator, meaning one has to resort to 'insert*'?
Please, no answers that say "don't pre-optimise", it drives me nuts. ;) This is an architectural question.
Comments (2)
The C++ standard, in its [list.modifiers] section, says that any list insertion operation "does not affect the validity of iterators and references", and that any removal operation "invalidates only the iterators and references to the erased elements". So keeping iterators around is safe for as long as the element they refer to has not been erased.
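A small sketch of what that guarantee means in practice, using a plain std::list<int>; the function name is made up for illustration:

```cpp
#include <list>

// Returns true if an iterator taken before unrelated insertions and
// erasures still refers to the same element afterwards.
bool iteratorSurvivesOtherChanges() {
    std::list<int> l;
    std::list<int>::iterator keep = l.insert(l.end(), 42);
    int *addr = &*keep;  // address of the element, for comparison later

    l.push_front(1);     // insertions elsewhere in the list
    l.push_back(2);
    std::list<int>::iterator gone = l.insert(l.end(), 3);
    l.erase(gone);       // only 'gone' is invalidated
    l.pop_front();       // removes the 1, not the element 'keep' refers to

    return *keep == 42 && &*keep == addr;
}
```

The same sequence on a std::vector would not be safe, since vector insertions may reallocate and invalidate every iterator.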
Keeping iterators inside the objects also seems sane, especially if you don't call them iterators but give them a name like FooManagerHandlers and let the removal function process them in an opaque way. Indeed, you do not store "iterators"; you store "representatives" of objects in an organized structure. These representatives define the position of an object inside that structure. This is a separate, fairly high-level concept, and there's nothing illogical in implementing it.
However, the point of using lists is not just O(1) insert/remove, but also keeping elements in order. If you don't need any order, then you would probably find hash tables more useful.
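If ordering really does not matter, the hash table suggestion could be sketched roughly as follows. This assumes C++11's std::unordered_map keyed by the raw pointer; Item and UnorderedRegistry are hypothetical names, and erase is O(1) on average rather than guaranteed.

```cpp
#include <memory>
#include <unordered_map>
#include <utility>

struct Item {  // hypothetical stand-in for Foo
    int value;
    explicit Item(int v) : value(v) {}
};

struct UnorderedRegistry {  // hypothetical name
    // The raw pointer is the key, the shared_ptr keeps the object alive.
    std::unordered_map<Item *, std::shared_ptr<Item>> items;

    Item *add(int v) {
        std::shared_ptr<Item> p = std::make_shared<Item>(v);
        Item *raw = p.get();
        items.emplace(raw, std::move(p));
        return raw;
    }

    // Average O(1) lookup and erase; returns false if the pointer is unknown.
    bool erase(Item *item) {
        return items.erase(item) == 1;
    }
};
```

Here the object can be removed from a bare `this` pointer without storing any handle in it, at the cost of losing the list's ordering.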
The one problem I see with storing the iterator in the object is that you must be careful about erasing the object through some other iterator: your object's destructor does not know where it was destroyed from, so you can end up with an invalid iterator in the destructor.
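One way to stay clear of that, sketched under the assumption that the owner of the list does the erasing while it iterates, is to let the loop obtain its next position from erase() instead of letting elements remove themselves mid-loop. Timed and tickAll are hypothetical names, and std::shared_ptr (C++11) is used in place of boost::shared_ptr.

```cpp
#include <cstddef>
#include <list>
#include <memory>

struct Timed {  // hypothetical stand-in for Foo with a countdown timer
    int ticksLeft;
    explicit Timed(int t) : ticksLeft(t) {}
};

typedef std::list<std::shared_ptr<Timed>> TimedList;

// Ticks every element once and removes those whose timer reaches zero.
// Returns the number of elements removed.
std::size_t tickAll(TimedList &l) {
    std::size_t removed = 0;
    for (TimedList::iterator it = l.begin(); it != l.end(); ) {
        if (--(*it)->ticksLeft <= 0) {
            it = l.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Because the destructor of Timed never looks at a stored iterator, it does not matter which path led to the element being erased.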
The reason that push* does not return an iterator is that it is the inverse of pop*, allowing you to treat your container as a stack, queue, or deque.
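For the case where an iterator is wanted anyway, two equivalent ways of appending to a std::list and getting the position back can be sketched like this (the helper names are made up; std::prev requires C++11):

```cpp
#include <iterator>
#include <list>

// Appends with push_back, then recovers the iterator to the new last element.
std::list<int>::iterator pushBackWithIterator(std::list<int> &l, int v) {
    l.push_back(v);
    return std::prev(l.end());
}

// Appends with insert, which returns the iterator directly.
std::list<int>::iterator insertBack(std::list<int> &l, int v) {
    return l.insert(l.end(), v);
}
```

Both are constant time for a list, so using insert() instead of push_back() costs nothing beyond the slightly longer call.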