关于 std::vector 的两个简短问题
- 创建向量时,它具有默认的分配大小(可能这不是正确的术语,也许是步长?)。当元素数量达到此大小时,将调整向量的大小。这个大小编译器特定吗?我能控制它吗?这是个好主意吗?
重复调用
vector::size()
重新计算元素数量(O(n)
计算)或者该值是否存储在某处(O (1)
查找)。例如,在下面的代码中// 按空格分割给定字符串 矢量<字符串> split( const string& s ) { 矢量<字符串>代币; 字符串::大小类型 i, j; 我 = 0; while ( i != s.size() ) { // 忽略前导空格 while ( isspace(s[i]) && i != s.size() ) { 我++; } // 找到一个单词,现在找到它的结尾 j = 我; while ( !isspace(s[j]) && j != s.size() ) { j++; } // 如果我们找到一个单词,将其添加到向量中 如果(我!= j){ tokens.push_back( s.substr(i, ji) ); 我=j; } } 返回代币; }
假设 s
可能非常大,我是否应该仅调用 s.size()
一次并存储结果?
谢谢!
- When a vector is created it has a default allocation size (probably this is not the right term to use, maybe step size?). When the number of elements reaches this size, the vector is resized. Is this size compiler specific? Can I control it? Is this a good idea?
Do repeated calls to
vector::size()
recount the number of elements (O(n)
calculation) or is this value stored somewhere (O(1)
lookup). For example, in the code below// Split given string on whitespace vector<string> split( const string& s ) { vector<string> tokens; string::size_type i, j; i = 0; while ( i != s.size() ) { // ignore leading blanks while ( isspace(s[i]) && i != s.size() ) { i++; } // found a word, now find its end j = i; while ( !isspace(s[j]) && j != s.size() ) { j++; } // if we found a word, add it to the vector if ( i != j ) { tokens.push_back( s.substr(i, j-i) ); i = j; } } return tokens; }
assuming s
can be very large, should I call s.size()
only once and store the result?
Thanks!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
在大多数情况下,您应该保留分配,除非您提前知道项目数量,以便可以保留正确的空间量。
至少在我所知的每种情况下,std::vector::size() 只是返回一个存储的值,因此它具有恒定的复杂性。理论上,C++ 标准允许它这样做。有理由允许其他一些容器(主要是 std::list ),而不是为这些容器制定特殊情况,他们只是建议所有容器使用恒定时间,而不是要求任何容器使用恒定时间。我无法想象一个能够计算元素数量的
vector::size
——我几乎不存在这样的东西。PS,一种更简单的方法可以完成上面的代码,如下所示:
编辑:IMO,The C++ Standard Library,作者:Nicolai Josuttis,是此类内容的绝佳参考。
In most cases, you should leave the allocation alone unless you know the number of items ahead of time, so you can reserve the correct amount of space.
At least in every case of which I'm aware,
std::vector::size()
just returns a stored value, so it has constant complexity. In theory, the C++ standard allows it to do otherwise. There are reasons to allow otherwise for some other containers, primarilystd::list
, and rather than make a special case for those, they simply recommend constant time for all containers instead of requiring it for any. I can't quite imagine avector::size
that counted elements though -- I'm pretty no such thing has ever existed.P.S., an easier way to do what your code above does, is something like this:
Edit: IMO, The C++ Standard Library, by Nicolai Josuttis is an excellent reference on such things.
容量增量的实际大小取决于实现,但它必须(大致)呈指数级才能支持容器的复杂性要求。例如,Visual C++ 标准库将准确分配前几个元素(如果我没记错的话,五个)所需的空间,然后以指数方式增加大小。
大小必须以某种方式存储在向量中,否则它不知道序列的末尾在哪里!然而,它不一定存储为整数。 Visual C++ 实现(再次作为示例)存储三个指针:
大小可以由(1)和(2)计算得出;容量可以根据 (1) 和 (3) 计算。
其他实现可能会以不同的方式存储信息。
The actual size of the capacity increment is implementation-dependent, but it has to be (roughly) exponential to support the container's complexity requirements. As an example, the Visual C++ standard library will allocate exactly the space required for the first few elements (five, if I recall correctly), then increases the size exponentially after that.
The size has to be stored somehow in the vector, otherwise it doesn't know where the end of the sequence is! However, it may not necessarily be stored as an integer. The Visual C++ implementation (again, as an example) stores three pointers:
The size can be computed from (1) and (2); the capacity can be computed from (1) and (3).
Other implementations might store the information differently.
它是特定于库的。您也许能够控制增量分配,但也可能不能。
大小已存储,因此检索速度非常快(恒定时间)。不然还能怎样呢? C 通常无法知道内存位置是否是“真实数据”。
由于
It's library-specific. You might be able to control the incremental allocation, but you might not.
The size is stored, so it is very fast (constant time) to retrieve. How else could it work? C has no way of knowing in general whether a memory location is "real data" or not.
调整大小机制通常是固定的。 (大多数编译器在达到限制时将向量的大小加倍。)C++ 标准没有指定控制此行为的方法。
每当您插入/删除元素时,大小都会在内部更新,并且当您调用
size()
时,它会立即返回。所以是的,它的时间复杂度为O(1)。The resizing mechanism is usually fixed. (Most compilers double the size of the vector when it reaches the limit.) The C++ standard specifies no way to control this behaviour.
The size is internally updated whenever you insert/remove elements and when you call
size()
, it's returned immediately. So yes, it's O(1).与您的实际问题无关,但这是一种更“STL”的方式来完成您正在做的事情:
Unrelated to your actual questions, but here's a more "STL" way of doing what you're doing:
一般来说,这是特定于库的行为,但是您可能可以通过指定自定义分配器来影响此行为,这是一项艰巨的工作。
大多数实现将大小存储为成员。这是单个内存读取。
In general, this is a library-specific behavior, but you may be able to influence this behavior by specifying a custom allocator, which is non-trivial work.
Most implementations store the size as a member. It's a single memory read.