关于 std::vector 的两个简短问题

发布于 2024-09-10 04:24:05 字数 855 浏览 4 评论 0原文

  1. 创建向量时,它具有默认的分配大小(可能这不是正确的术语,也许是步长?)。当元素数量达到此大小时,将调整向量的大小。这个大小编译器特定吗?我能控制它吗?这是个好主意吗?
  2. 重复调用vector::size()重新计算元素数量(O(n)计算)或者该值是否存储在某处(O (1) 查找)。例如,在下面的代码中

    // 按空格分割给定字符串
    矢量<字符串> split( const string& s )
    {
        矢量<字符串>代币;
        字符串::大小类型 i, j;
        我 = 0;
        while ( i != s.size() ) {
            // 忽略前导空格
            while ( isspace(s[i]) && i != s.size() ) {
                我++;
            }
            // 找到一个单词,现在找到它的结尾
            j = 我;
            while ( !isspace(s[j]) && j != s.size() ) {
                j++;
            }
            // 如果我们找到一个单词,将其添加到向量中
            如果(我!= j){ 
                tokens.push_back( s.substr(i, ji) );
                我=j;
            }
        }
        返回代币;
    }
    

假设 s 可能非常大,我是否应该仅调用 s.size() 一次并存储结果?

谢谢!

  1. When a vector is created it has a default allocation size (probably this is not the right term to use, maybe step size?). When the number of elements reaches this size, the vector is resized. Is this size compiler specific? Can I control it? Is this a good idea?
  2. Do repeated calls to vector::size() recount the number of elements (O(n) calculation) or is this value stored somewhere (O(1) lookup). For example, in the code below

    // Split given string on whitespace
    vector<string> split( const string& s )
    {
        vector<string> tokens;
        string::size_type i, j;
        i = 0;
        while ( i != s.size() ) {
            // ignore leading blanks
            while ( isspace(s[i]) && i != s.size() ) {
                i++;
            }
            // found a word, now find its end
            j = i;
            while ( !isspace(s[j]) && j != s.size() ) {
                j++;
            }
            // if we found a word, add it to the vector
            if ( i != j ) { 
                tokens.push_back( s.substr(i, j-i) );
                i = j;
            }
        }
        return tokens;
    }
    

assuming s can be very large, should I call s.size() only once and store the result?

Thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

回首观望 2024-09-17 04:24:05

在大多数情况下,您应该保留分配,除非您提前知道项目数量,以便可以保留正确的空间量。

至少在我所知的每种情况下,std::vector::size() 只是返回一个存储的值,因此它具有恒定的复杂性。理论上,C++ 标准允许它这样做。有理由允许其他一些容器(主要是 std::list ),而不是为这些容器制定特殊情况,他们只是建议所有容器使用恒定时间,而不是要求任何容器使用恒定时间。我无法想象一个能够计算元素数量的 vector::size ——我几乎不存在这样的东西。

PS,一种更简单的方法可以完成上面的代码,如下所示:

std::vector<string> split(std::string const &input) {
    vector<string> ret;
    istringstream buffer(input);

    copy(istream_iterator<string>(input),
         istream_iterator<string>(),
         back_inserter(ret));

    return ret;
}

编辑:IMO,The C++ Standard Library,作者:Nicolai Josuttis,是此类内容的绝佳参考。

In most cases, you should leave the allocation alone unless you know the number of items ahead of time, so you can reserve the correct amount of space.

At least in every case of which I'm aware, std::vector::size() just returns a stored value, so it has constant complexity. In theory, the C++ standard allows it to do otherwise. There are reasons to allow otherwise for some other containers, primarily std::list, and rather than make a special case for those, they simply recommend constant time for all containers instead of requiring it for any. I can't quite imagine a vector::size that counted elements though -- I'm pretty no such thing has ever existed.

P.S., an easier way to do what your code above does, is something like this:

std::vector<string> split(std::string const &input) {
    vector<string> ret;
    istringstream buffer(input);

    copy(istream_iterator<string>(input),
         istream_iterator<string>(),
         back_inserter(ret));

    return ret;
}

Edit: IMO, The C++ Standard Library, by Nicolai Josuttis is an excellent reference on such things.

内心荒芜 2024-09-17 04:24:05

容量增量的实际大小取决于实现,但它必须(大致)呈指数级才能支持容器的复杂性要求。例如,Visual C++ 标准库将准确分配前几个元素(如果我没记错的话,五个)所需的空间,然后以指数方式增加大小。

大小必须以某种方式存储在向量中,否则它不知道序列的末尾在哪里!然而,它不一定存储为整数。 Visual C++ 实现(再次作为示例)存储三个指针:

  1. 一个指向基础数组开头的指针、
  2. 一个指向序列当前结尾的指针以及
  3. 一个指向基础数组结尾的指针。

大小可以由(1)和(2)计算得出;容量可以根据 (1) 和 (3) 计算。

其他实现可能会以不同的方式存储信息。

The actual size of the capacity increment is implementation-dependent, but it has to be (roughly) exponential to support the container's complexity requirements. As an example, the Visual C++ standard library will allocate exactly the space required for the first few elements (five, if I recall correctly), then increases the size exponentially after that.

The size has to be stored somehow in the vector, otherwise it doesn't know where the end of the sequence is! However, it may not necessarily be stored as an integer. The Visual C++ implementation (again, as an example) stores three pointers:

  1. a pointer to the beginning of the underlying array,
  2. a pointer to the current end of the sequence, and
  3. a pointer to the end of the underlying array.

The size can be computed from (1) and (2); the capacity can be computed from (1) and (3).

Other implementations might store the information differently.

无所的.畏惧 2024-09-17 04:24:05
  1. 它是特定于库的。您也许能够控制增量分配,但也可能不能。

  2. 大小已存储,因此检索速度非常快(恒定时间)。不然还能怎样呢? C 通常无法知道内存位置是否是“真实数据”。

    由于

  1. It's library-specific. You might be able to control the incremental allocation, but you might not.

  2. The size is stored, so it is very fast (constant time) to retrieve. How else could it work? C has no way of knowing in general whether a memory location is "real data" or not.

如痴如狂 2024-09-17 04:24:05
  1. 调整大小机制通常是固定的。 (大多数编译器在达到限制时将向量的大小加倍。)C++ 标准没有指定控制此行为的方法。

  2. 每当您插入/删除元素时,大小都会在内部更新,并且当您调用 size() 时,它会立即返回。所以是的,它的时间复杂度为O(1)

  1. The resizing mechanism is usually fixed. (Most compilers double the size of the vector when it reaches the limit.) The C++ standard specifies no way to control this behaviour.

  2. The size is internally updated whenever you insert/remove elements and when you call size(), it's returned immediately. So yes, it's O(1).

黎歌 2024-09-17 04:24:05

与您的实际问题无关,但这是一种更“STL”的方式来完成您正在做的事情:

vector<string> split(const string& s)
{
    istringstream stream(s);
    istream_iterator<string> iter(stream), eos;
    vector<string> tokens;
    copy(iter, eos, back_inserter(tokens));
    return tokens;
}

Unrelated to your actual questions, but here's a more "STL" way of doing what you're doing:

vector<string> split(const string& s)
{
    istringstream stream(s);
    istream_iterator<string> iter(stream), eos;
    vector<string> tokens;
    copy(iter, eos, back_inserter(tokens));
    return tokens;
}
北方。的韩爷 2024-09-17 04:24:05

当元素数量达到此大小时,将调整向量的大小。这个大小编译器特定吗?我能控制它吗?这是个好主意吗?

一般来说,这是特定于库的行为,但是您可能可以通过指定自定义分配器来影响此行为,这是一项艰巨的工作。

重复调用 vector::size() 重新计算元素数量(O(n) 计算),或者该值是否存储在某处(O(1) 查找)。

大多数实现将大小存储为成员。这是单个内存读取。

When the number of elements reaches this size, the vector is resized. Is this size compiler specific? Can I control it? Is this a good idea?

In general, this is a library-specific behavior, but you may be able to influence this behavior by specifying a custom allocator, which is non-trivial work.

Do repeated calls to vector::size() recount the number of elements (O(n) calculation) or is this value stored somewhere (O(1) lookup).

Most implementations store the size as a member. It's a single memory read.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文