Is it a bad idea in C++ to do something like:
    struct Node
    {
        string name;
        vector<Node> children;
    };
I'm asking because it looks to me like resizing children for any reason could potentially cause an exponential copying cascade. But on the other hand, vector<Node*> has its own problems, and things like vector<shared_ptr<Node>>, while safe, have poor memory locality (due to too many indirections), which starts hurting when you're trying to e.g. build a gigantic tree in memory.
So, in general, is it a bad idea to have a container of your own type in C++ (03)? Are there other reasons to avoid (or prefer) this idiom, instead of using containers of pointers?
Comments (4)
It's not bad as a general rule. It really comes down to the context.
In this case, the indirections of a shared pointer should not be significant compared to nontrivial resize operations and the cascading copies (assuming those turn out to be a problem).
If you can reserve the proper sizes prior to populating the vector or make copying trivial, then there's not much to worry about here.
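As a minimal sketch of the reserve idea, assuming the number of children is known before they are added: the helper name make_parent is hypothetical and not part of the original question. Reserving up front means push_back never has to reallocate, so children that are already in the vector are not copied again.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same shape as the Node in the question. Node is still incomplete where
// vector<Node> is named; common implementations accept this, but the
// standard only guarantees it since C++17.
struct Node
{
    std::string name;
    std::vector<Node> children;
};

// Hypothetical helper: builds a parent with `count` leaf children.
Node make_parent(const std::string& name, std::size_t count)
{
    Node parent;
    parent.name = name;
    parent.children.reserve(count);  // single allocation up front

    for (std::size_t i = 0; i < count; ++i)
    {
        Node child;
        child.name = "child";
        // Capacity is already sufficient, so this never reallocates and
        // never re-copies the children inserted earlier.
        parent.children.push_back(child);
    }
    return parent;
}
```

This only helps when the child count is known (or can be bounded) ahead of time; otherwise the vector still grows geometrically and copies whole subtrees on each reallocation.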
If it's really gigantic, you could base your nodes on an external store which handles (and potentially shares) the nodes. Then your resizing ops would be trivial, as your children would all be pointers.
Depending on the size of the Nodes, this:
may be better than shared nodes. It really depends on the size, the depth, and how often you resize. A container isn't bad, but you will have to know how your program will execute to choose the best strategy (profiling can also help here, even when you think you already know the fastest option).
If you know you have a lot of objects and a lot of resizing to do, then a fast external store will be a good choice; jalf's answer outlines a good strategy for this.
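One possible shape for such an external store is sketched below. The NodeStore class and its create function are assumptions made for illustration, not something the answer specifies. The sketch relies on the fact that std::deque::push_back does not invalidate pointers to existing elements, so each node can hold plain, non-owning pointers to its children.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct Node
{
    std::string name;
    std::vector<Node*> children;  // non-owning; the store owns every node
};

// Hypothetical store that owns all nodes of one tree.
// Copying a NodeStore would leave the copied nodes pointing into the
// original store, so treat it as non-copyable in real code.
class NodeStore
{
public:
    // Creates a node inside the store and returns a pointer that stays
    // valid for as long as the store itself is alive.
    Node* create(const std::string& name)
    {
        nodes_.push_back(Node());
        nodes_.back().name = name;
        return &nodes_.back();
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::deque<Node> nodes_;  // push_back keeps element addresses stable
};
```

With this layout, growing a children vector only copies pointers, so there is no cascade; the cost is one extra indirection per child access.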
If you will have a lot of mutations, you could also use a list of node vectors on a backing store, with pointers to list elements for the children. Supporting these more complex implementations takes more time than your current design, but they are worth trying if you do need to perform a lot of mutations. Simplicity of implementation and maintenance would then be a point in favour of your original design.
If your graph really doesn't grow very large, another approach would be to use custom allocators which either refer to a backing store of nodes, or shrink less often than the default allocators.
Depends on several things:
Node class)
You're right, it could potentially have this effect, but only if your data structure is deep and wide enough for the cascade to actually be noticeable.
Adding move semantics to your class would solve the problem entirely, but alternatively, you could modify the data structure. One option might be to not store the vectors inside nodes at all. Perhaps keep one single large vector which is shared between all nodes, and let each node store two iterators pointing to the range of elements that belong to it.
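The shared-vector idea can be sketched roughly as follows. This is one interpretation under stated assumptions: the FlatNode and FlatTree names are hypothetical, and the sketch stores an index range instead of two iterators, because iterators into a std::vector are invalidated when the vector reallocates. It also assumes that all children of a node are added in one call, so that they stay contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A node holds no container of its own, only a range into the shared vector.
struct FlatNode
{
    std::string name;
    std::size_t first_child;  // index of the first child in the shared vector
    std::size_t child_count;  // number of contiguous children
};

class FlatTree
{
public:
    explicit FlatTree(const std::string& root_name)
    {
        nodes_.push_back(make_node(root_name));  // index 0 is the root
    }

    // Appends all children of `parent` at once so they form one contiguous
    // range. Returns the index of the first new child.
    std::size_t add_children(std::size_t parent,
                             const std::vector<std::string>& names)
    {
        assert(parent < nodes_.size());
        assert(nodes_[parent].child_count == 0);  // children are set only once

        const std::size_t first = nodes_.size();
        for (std::size_t i = 0; i < names.size(); ++i)
        {
            nodes_.push_back(make_node(names[i]));
        }
        nodes_[parent].first_child = first;
        nodes_[parent].child_count = names.size();
        return first;
    }

    const FlatNode& node(std::size_t index) const { return nodes_[index]; }
    std::size_t size() const { return nodes_.size(); }

private:
    static FlatNode make_node(const std::string& name)
    {
        FlatNode n;
        n.name = name;
        n.first_child = 0;
        n.child_count = 0;
        return n;
    }

    std::vector<FlatNode> nodes_;  // the single vector shared by all nodes
};
```

When the shared vector reallocates, each FlatNode is copied exactly once and no copy recurses into a subtree, which is what removes the cascade. The trade-off is that inserting children into an existing range later is no longer cheap.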
Won't making children a pointer to a vector of Nodes help with the copying cascade?
If vector is std::vector, then this code has undefined behaviour, as [res.on.functions] states (with similar wording in both C++03 and C++11): the effects are undefined if an incomplete type is used as a template argument when instantiating a template component, unless specifically allowed for that component.
If you want to do this, use an implementation of vector that explicitly allows recursive instantiations, such as boost::container::vector.