关于 std::vector 的两个简短问题

发布于 2024-09-10 04:24:05 字数 855 浏览 4 评论 0原文

创建向量时，它具有默认的分配大小（可能这不是正确的术语，也许是步长？）。当元素数量达到此大小时，将调整向量的大小。这个大小编译器特定吗？我能控制它吗？这是个好主意吗？

重复调用vector::size()重新计算元素数量（O(n)计算）或者该值是否存储在某处（O (1) 查找）。例如，在下面的代码中

// 按空格分割给定字符串
矢量<字符串> split( const string& s )
{
    矢量<字符串>代币；
    字符串::大小类型 i, j;
    我 = 0;
    while ( i != s.size() ) {
        // 忽略前导空格
        while ( isspace(s[i]) && i != s.size() ) {
            我++;
        }
        // 找到一个单词，现在找到它的结尾
        j = 我；
        while ( !isspace(s[j]) && j != s.size() ) {
            j++;
        }
        // 如果我们找到一个单词，将其添加到向量中
        如果（我！= j）{ 
            tokens.push_back( s.substr(i, ji) );
            我=j；
        }
    }
    返回代币；
}

假设 s 可能非常大，我是否应该仅调用 s.size() 一次并存储结果？

谢谢！

原文

When a vector is created it has a default allocation size (probably this is not the right term to use, maybe step size?). When the number of elements reaches this size, the vector is resized. Is this size compiler specific? Can I control it? Is this a good idea?

Do repeated calls to vector::size() recount the number of elements (O(n) calculation) or is this value stored somewhere (O(1) lookup). For example, in the code below

// Split given string on whitespace
vector<string> split( const string& s )
{
    vector<string> tokens;
    string::size_type i, j;
    i = 0;
    while ( i != s.size() ) {
        // ignore leading blanks
        while ( isspace(s[i]) && i != s.size() ) {
            i++;
        }
        // found a word, now find its end
        j = i;
        while ( !isspace(s[j]) && j != s.size() ) {
            j++;
        }
        // if we found a word, add it to the vector
        if ( i != j ) { 
            tokens.push_back( s.substr(i, j-i) );
            i = j;
        }
    }
    return tokens;
}

assuming s can be very large, should I call s.size() only once and store the result?

Thanks!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

回首观望 2024-09-17 04:24:05

在大多数情况下，您应该保留分配，除非您提前知道项目数量，以便可以保留正确的空间量。

至少在我所知的每种情况下，std::vector::size() 只是返回一个存储的值，因此它具有恒定的复杂性。理论上，C++ 标准允许它这样做。有理由允许其他一些容器（主要是 std::list ），而不是为这些容器制定特殊情况，他们只是建议所有容器使用恒定时间，而不是要求任何容器使用恒定时间。我无法想象一个能够计算元素数量的 vector::size ——我几乎不存在这样的东西。

PS，一种更简单的方法可以完成上面的代码，如下所示：

std::vector<string> split(std::string const &input) {
    vector<string> ret;
    istringstream buffer(input);

    copy(istream_iterator<string>(input),
         istream_iterator<string>(),
         back_inserter(ret));

    return ret;
}

编辑：IMO，The C++ Standard Library，作者：Nicolai Josuttis，是此类内容的绝佳参考。

In most cases, you should leave the allocation alone unless you know the number of items ahead of time, so you can reserve the correct amount of space.

At least in every case of which I'm aware, std::vector::size() just returns a stored value, so it has constant complexity. In theory, the C++ standard allows it to do otherwise. There are reasons to allow otherwise for some other containers, primarily std::list, and rather than make a special case for those, they simply recommend constant time for all containers instead of requiring it for any. I can't quite imagine a vector::size that counted elements though -- I'm pretty no such thing has ever existed.

P.S., an easier way to do what your code above does, is something like this:

std::vector<string> split(std::string const &input) {
    vector<string> ret;
    istringstream buffer(input);

    copy(istream_iterator<string>(input),
         istream_iterator<string>(),
         back_inserter(ret));

    return ret;
}

Edit: IMO, The C++ Standard Library, by Nicolai Josuttis is an excellent reference on such things.

回复收藏 0 原文

内心荒芜 2024-09-17 04:24:05

容量增量的实际大小取决于实现，但它必须（大致）呈指数级才能支持容器的复杂性要求。例如，Visual C++ 标准库将准确分配前几个元素（如果我没记错的话，五个）所需的空间，然后以指数方式增加大小。

大小必须以某种方式存储在向量中，否则它不知道序列的末尾在哪里！然而，它不一定存储为整数。 Visual C++ 实现（再次作为示例）存储三个指针：

一个指向基础数组开头的指针、
一个指向序列当前结尾的指针以及
一个指向基础数组结尾的指针。

大小可以由（1）和（2）计算得出；容量可以根据 (1) 和 (3) 计算。

其他实现可能会以不同的方式存储信息。

回复收藏 0 原文

无所的.畏惧 2024-09-17 04:24:05

它是特定于库的。您也许能够控制增量分配，但也可能不能。
大小已存储，因此检索速度非常快（恒定时间）。不然还能怎样呢？ C 通常无法知道内存位置是否是“真实数据”。
由于

回复收藏 0 原文

如痴如狂 2024-09-17 04:24:05

调整大小机制通常是固定的。（大多数编译器在达到限制时将向量的大小加倍。）C++ 标准没有指定控制此行为的方法。
每当您插入/删除元素时，大小都会在内部更新，并且当您调用 size() 时，它会立即返回。所以是的，它的时间复杂度为O(1)。

回复收藏 0 原文

黎歌 2024-09-17 04:24:05

与您的实际问题无关，但这是一种更“STL”的方式来完成您正在做的事情：

vector<string> split(const string& s)
{
    istringstream stream(s);
    istream_iterator<string> iter(stream), eos;
    vector<string> tokens;
    copy(iter, eos, back_inserter(tokens));
    return tokens;
}

Unrelated to your actual questions, but here's a more "STL" way of doing what you're doing:

vector<string> split(const string& s)
{
    istringstream stream(s);
    istream_iterator<string> iter(stream), eos;
    vector<string> tokens;
    copy(iter, eos, back_inserter(tokens));
    return tokens;
}

回复收藏 0 原文