VisualStudio2008 上的 std::vector 似乎未实现最佳实现 - 复制构造函数调用过多

发布于 2024-07-14 21:54:01 字数 1523 浏览 4 评论 0原文

我一直在将流行的 XmlRpc 库的 STL 实现与主要避免 STL 的实现进行比较。 STL 实现要慢得多 - 我从 47 秒降至 4.5 秒。我已经诊断出一些原因：部分原因是 std::string 被误用（例如，作者应该尽可能使用“const std::string&” - 不要仅仅使用 std::string 就好像它们是 Java 字符串），但这也是因为每次向量超出其边界时都会不断地调用复制构造函数，这种情况非常频繁。复制构造函数非常慢，因为它们对树（XmlRpc 值）进行深层复制。

StackOverflow 上的其他人告诉我，std::vector 实现通常会在每次超出缓冲区大小时将其大小加倍。 VisualStudio 2008 上的情况似乎并非如此：向 std::vector 添加 50 个项目需要调用 177 次复制构造函数。每次加倍应该调用复制构造函数 64 次。如果您非常关心保持较低的内存使用量，那么每次增加 50% 就应该调用复制构造函数 121 次。那么177从哪里来呢？

我的问题是：(a) 为什么复制构造函数被如此频繁地调用？ (b) 如果您只是将对象从一个位置移动到另一个位置，是否有任何方法可以避免使用复制构造函数？（在这种情况下，实际上大多数情况下， memcpy() 就足够了 - 这会产生很大的差异）。

（注意：我了解 vector::reserve()，我只是有点失望，因为应用程序程序员需要实现加倍技巧，因为这样的东西已经是任何好的 STL 实现的一部分。）

我的测试程序：

#include <string>
#include <iostream>
#include <vector>



using namespace std;


int constructorCalls;
int assignmentCalls;
int copyCalls;


class C {
    int n;

public:
    C(int _n) { n = _n; constructorCalls++; }
    C(const C& orig) { copyCalls++; n = orig.n; }
    void operator=(const C &orig) { assignmentCalls++; n = orig.n; }
};



int main(int argc, char* argv[])
{
    std::vector<C> A;

    //A.reserve(50);
    for (int i=0; i < 50; i++)
        A.push_back(i);
    cout << "constructor calls = " << constructorCalls << "\n";
    cout << "assignment calls = " << assignmentCalls << "\n";
    cout << "copy calls = " << copyCalls << "\n";
    return 0;
}

原文

I've been comparing a STL implementation of a popular XmlRpc library with an implementation that mostly avoids STL. The STL implementation is much slower - I got 47s down to 4.5s. I've diagnosed some of the reasons: it's partly due to std::string being mis-used (e.g. the author should have used "const std::string&" wherever possible - don't just use std::string's as if they were Java strings), but it's also because copy constructors were being constantly called each time the vector outgrew its bounds, which was exceedingly often. The copy constructors were very slow because they did deep-copies of trees (of XmlRpc values).

I was told by someone else on StackOverflow that std::vector implementations typically double the size of the buffer each time they outgrow. This does not seem to be the case on VisualStudio 2008: to add 50 items to a std::vector took 177 calls of the copy constructor. Doubling each time should call the copy constructor 64 times. If you were very concerned about keeping memory usage low, then increasing by 50% each time should call the copy constructor 121 times. So where does the 177 come from?

My question is: (a) why is the copy constructor called so often? (b) is there any way to avoid using the copy constructor if you're just moving an object from one location to another? (In this case and indeed most cases a memcpy() would have sufficed - and this makes a BIG difference).

(NB: I know about vector::reserve(), I'm just a bit disappointed that application programmers would need to implement the doubling trick when something like this is already part of any good STL implementation.)

My test program:

#include <string>
#include <iostream>
#include <vector>



using namespace std;


int constructorCalls;
int assignmentCalls;
int copyCalls;


class C {
    int n;

public:
    C(int _n) { n = _n; constructorCalls++; }
    C(const C& orig) { copyCalls++; n = orig.n; }
    void operator=(const C &orig) { assignmentCalls++; n = orig.n; }
};



int main(int argc, char* argv[])
{
    std::vector<C> A;

    //A.reserve(50);
    for (int i=0; i < 50; i++)
        A.push_back(i);
    cout << "constructor calls = " << constructorCalls << "\n";
    cout << "assignment calls = " << assignmentCalls << "\n";
    cout << "copy calls = " << copyCalls << "\n";
    return 0;
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

妖妓 2024-07-21 21:54:01

不要忘记计算将临时 C 对象 push_back 放入向量所需的复制构造函数调用次数。每次迭代将至少调用C 的复制构造函数一次。

如果添加更多打印代码，发生的情况会更清楚：

std::vector<C> A;
std::vector<C>::size_type prevCapacity = A.capacity();

for (int i=0; i < 50; i++) {
    A.push_back(i);
    if(prevCapacity != A.capacity()) {
       cout << "capacity " << prevCapacity << " -> " << A.capacity() << "\n";
    }
    prevCapacity = A.capacity();
}

输出如下：

capacity 0 -> 1
capacity 1 -> 2
capacity 2 -> 3
capacity 3 -> 4
capacity 4 -> 6
capacity 6 -> 9
capacity 9 -> 13
capacity 13 -> 19
capacity 19 -> 28
capacity 28 -> 42
capacity 42 -> 63

所以是的，容量每次增加 50%，这占了 127 份：

1 + 2 + 3 + 4 + 6 + 9 + 13 + 19 + 28 + 42 = 127

从 50 份中添加 50 份额外的份数调用 push_back 并且您有 177：

127 + 50 = 177

Don't forget to count the copy constructor calls needed to push_back a temporary C object into the vector. Each iteration will call C's copy constructor at least once.

If you add more printing code, it's a bit clearer what is going on:

std::vector<C> A;
std::vector<C>::size_type prevCapacity = A.capacity();

for (int i=0; i < 50; i++) {
    A.push_back(i);
    if(prevCapacity != A.capacity()) {
       cout << "capacity " << prevCapacity << " -> " << A.capacity() << "\n";
    }
    prevCapacity = A.capacity();
}

This has the following output:

capacity 0 -> 1
capacity 1 -> 2
capacity 2 -> 3
capacity 3 -> 4
capacity 4 -> 6
capacity 6 -> 9
capacity 9 -> 13
capacity 13 -> 19
capacity 19 -> 28
capacity 28 -> 42
capacity 42 -> 63

So yes, the capacity increases by 50% each time, and this accounts for 127 of the copies:

1 + 2 + 3 + 4 + 6 + 9 + 13 + 19 + 28 + 42 = 127

Add the 50 additional copies from 50 calls to push_back and you have 177:

127 + 50 = 177

回复收藏 0 原文

鱼忆七猫命九 2024-07-21 21:54:01

我的问题是：
(a) 为什么复制构造函数被如此频繁地调用？

因为当调整向量大小时，您需要将旧缓冲区中的所有元素复制到新缓冲区中。这是因为向量保证对象存储在连续的内存位置中。

(b) 如果您只是移动，有什么方法可以避免使用复制构造函数
一个物体从一个位置到另一个位置？

不，没有办法避免使用复制构造函数。
这是因为该对象有多个需要正确初始化的成员。
如果您使用memcpy，您如何知道该对象已正确初始化为该对象！

例如。如果对象包含智能指针。你不能只 memcpy 一个智能指针。它需要做额外的工作来跟踪所有权。否则，当原始对象超出范围时，内存将被删除，并且新对象将具有悬空指针。同样的原则适用于所有具有构造函数（复制构造函数）的对象，构造函数实际上执行所需的工作。

停止复制内容的方法就是预留空间。
这使得向量为其将存储的所有对象分配足够的空间。因此它不需要不断重新分配主缓冲区。它只是将对象复制到向量中。

每次加倍应该调用复制构造函数 64 次。
如果您非常关心保持较低的内存使用量，那么增加
50%每次应该调用复制构造函数121次。
那么177从哪里来呢？

向量分配的大小 = 1:
添加元素 1：（不重新分配）但将元素 1 复制到向量中。
添加元素 2：重新分配缓冲区（大小 2）：复制元素 1。将元素 2 复制到向量中。
添加元素 3：重新分配缓冲区（大小 4）：复制元素 1-2。将元素 3 复制到向量中。
添加元素 4：将元素 4 复制到向量中
添加元素 5：重新分配缓冲区（大小 8）：复制元素 1-4。将元素 5 复制到向量中。
添加元素 6：将元素 6 复制到向量中
添加元素 7：将元素 7 复制到向量中
添加元素 8：将元素 8 复制到向量中
添加元素 9：重新分配缓冲区（大小 16）：复制元素 1-8。将元素 9 复制到向量中。
添加元素 10：将元素 10 复制到向量
等等。

前 10 个元素需要 25 个复制结构。
如果您先使用保留，则只需构建 10 次副本即可。

My question is:
(a) why is the copy constructor called so often?

Because when the vector is re-sized you need to copy all the elements from the old buffer into the new buffer. This is because the vector guarantees that the objects are stored in consecutive memory locations.

(b) is there any way to avoid using the copy constructor if you're just moving
an object from one location to another?

No there is no way to avoid the use of the copy constructor.
This because the object has several members that need to be initialized correctly.
If you used memcpy how do you know the object has been initialized correctly for the object!

For example. IF the object contained a smart pointer. You can't just memcpy a smart pointer. It needs to do extra work to track ownership. Otherwise when the original goes out of scope the memory is deleted and the new object has a dangling pointer. The same principle applies to all objects that have a constructor (copy constructor) the constructor actually does required work.

The way to stop the copy of the content is too reserve the space.
This makes vector allocate enough space for all the objects it will store. Thus it does not need to keep reallocating the main buffer. It just copies the objects into the vector.

Doubling each time should call the copy constructor 64 times.
If you were very concerned about keeping memory usage low, then increasing by
50% each time should call the copy constructor 121 times.
So where does the 177 come from?

Vector allocated size = 1:
Add element 1: (no reallocation) But copies element 1 into vector.
Add element 2: Reallocate buffer (size 2): Copy element 1 across. Copy element 2 into vector.
Add element 3: Reallocate buffer (size 4): Copy element 1-2 across. Copy element 3 into vector.
Add element 4: Copy element 4 into vector
Add element 5: Reallocate buffer (size 8): Copy element 1-4 across. Copy element 5 into vector.
Add element 6: Copy element 6 into vector
Add element 7: Copy element 7 into vector
Add element 8: Copy element 8 into vector
Add element 9: Reallocate buffer (size 16): Copy element 1-8 across. Copy element 9 into vector.
Add element 10: Copy element 10 into vector
etc.

First 10 elements took 25 copy constructions.
If you had used reserve first it would have only taken 10 copy constructions.

回复收藏 0 原文