插入映射时的内存分配

发布于 2024-08-24 05:27:52 字数 958 浏览 4 评论 0原文

#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <vector>
#include <string>
#include <iostream>
#include <map>
#include <utility>
#include <algorithm>

void * GetMemory(size_t n) {
  void *ptr = malloc(n);
  printf("getMem n %d   ptr 0x%x\n", n, reinterpret_cast<unsigned int> (ptr));
  return ptr;
}

void FreeMemory(void *p) {
  free(p);
}

void* operator new (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void* operator new [] (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void operator delete (void *p) {
  FreeMemory(p);
}

void operator delete [] (void *p) {
  FreeMemory(p);
}

typedef std::vector<int> vec;

int main(int argc, char *argv[]) {
  std::map<int, vec> z;
  vec x;
  z.insert(std::pair<int,vec>(1,x));
}

编译用 g++ -Wall -ansi test.cpp -o test

运行测试。

为什么有 3 次调用 GetMemory 并且 n = 0？

原文

#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <vector>
#include <string>
#include <iostream>
#include <map>
#include <utility>
#include <algorithm>

void * GetMemory(size_t n) {
  void *ptr = malloc(n);
  printf("getMem n %d   ptr 0x%x\n", n, reinterpret_cast<unsigned int> (ptr));
  return ptr;
}

void FreeMemory(void *p) {
  free(p);
}

void* operator new (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void* operator new [] (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void operator delete (void *p) {
  FreeMemory(p);
}

void operator delete [] (void *p) {
  FreeMemory(p);
}

typedef std::vector<int> vec;

int main(int argc, char *argv[]) {
  std::map<int, vec> z;
  vec x;
  z.insert(std::pair<int,vec>(1,x));
}

Compile with
g++ -Wall -ansi test.cpp -o test

Run test.

Why are there three calls to GetMemory with n = 0?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

·深蓝 2024-08-31 05:27:52

在 FreeMemory 中进行一些跟踪，并将 main 更改为：

int main(int argc, char *argv[]) {
  printf("map\n");
  std::map<int, vec> z;
  printf("vec\n");
  vec x;
  printf("pair\n");
  std::pair<int,vec> y(1,x);
  printf("insert\n");
  z.insert(y);
  printf("inserted 1\n");
  y.first = 2;
  printf("insert\n");
  z.insert(y);
  printf("inserted 2\n");

}

输出：

$ make mapinsert CXXFLAGS=-O3 -B && ./mapinsert
g++ -O3    mapinsert.cpp   -o mapinsert
map
vec
pair
getMem n 0   ptr 0x6b0258
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b0278
getMem n 0   ptr 0x6b02a0
FreeMemory ptr 0x6b0268
inserted 1
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b02b0
getMem n 0   ptr 0x6b02d8
FreeMemory ptr 0x6b0268
inserted 2
FreeMemory ptr 0x6b0258
FreeMemory ptr 0x6b02d8
FreeMemory ptr 0x6b02b0
FreeMemory ptr 0x6b02a0
FreeMemory ptr 0x6b0278

3 个 0 大小的分配中：

一个是将空向量复制到该对中。
一种是在地图中存储空向量的副本。

这两个显然是必要的。我不确定的是：

一种是在对 insert 的调用中将向量复制到某处，并且在对 insert 的调用中也将其释放。

就好像 insert （或它内部调用的东西）通过值而不是通过引用获取其参数，或者 insert 在某个时间之前显式地将副本放入自动变量中它分配新的地图节点。目前启动调试器对我来说很困难，我将把它留给其他人。

编辑：谜团解开了。 insert 采用 std::pair，而不是 std::pair。空向量的额外副本是因为您构造的对必须转换为（另一个）临时变量，然后将该临时变量的引用传递给 insert。 std::pair 有一个构造函数模板，可以让您摆脱几乎任何事情。 20.2.2/4：

template<class U, class V> pair(const pair<U,V> &p);

效果：初始化成员
参数的相应成员，
执行隐式转换为
需要。

我还观察到，在我的实现中，vec x; 不会调用 getMem，但 vec x(0); 会调用。实际上：

z[1] = vec();

是更少的代码并且剥夺了您制作额外副本的机会（尽管它改为调用 operator= ）。它仍然会进行 2 个 0 大小的分配，至少对我来说是这样。

C++ 标准定义 operator[] 来返回涉及调用 insert 的指定表达式的结果。我不确定这是否意味着 operator[] 的效果“就像”调用了 make_pair 和 insert （即标准与指定运算符[] 的源必须是什么一样好），或者只是返回的值与指定表达式产生的值相同。如果是后者，那么也许一个实现可以通过单个 0 大小的分配来完成。但是，如果不首先创建包含映射类型的对，map 就无法保证创建条目，因此应该预期会分配 2 次。或者更准确地说，是所需映射值的 2 个副本：复制 0 大小的向量会导致 0 大小的分配，这一事实取决于实现。

因此，如果您遇到的情况是，该值的复制成本确实很高，但默认构造的成本却很低（例如包含大量元素的容器），那么以下内容可能会很有用：

std::map<int, vec> z;
vec x(1000);
z[1] = x;
// i.e. (*(z.insert(std::pair<const int, vec>(1,vec())).first)).second = x;

进行 2 次大小为 4000 的分配和 2 次大小为 4000 的分配0，而：

std::map<int, vec> z;
vec x(1000);
z.insert(std::pair<const int, vec>(2, x));

使 3 个大小为 4000，且没有一个大小为 0。最终，大小足够大，第一个代码中的额外分配比第二个代码中的额外复制更便宜。

我不确定 C++0x 中的移动构造函数可能会对此有所帮助。

Stick some tracing in FreeMemory and change main to this:

int main(int argc, char *argv[]) {
  printf("map\n");
  std::map<int, vec> z;
  printf("vec\n");
  vec x;
  printf("pair\n");
  std::pair<int,vec> y(1,x);
  printf("insert\n");
  z.insert(y);
  printf("inserted 1\n");
  y.first = 2;
  printf("insert\n");
  z.insert(y);
  printf("inserted 2\n");

}

Output:

$ make mapinsert CXXFLAGS=-O3 -B && ./mapinsert
g++ -O3    mapinsert.cpp   -o mapinsert
map
vec
pair
getMem n 0   ptr 0x6b0258
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b0278
getMem n 0   ptr 0x6b02a0
FreeMemory ptr 0x6b0268
inserted 1
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b02b0
getMem n 0   ptr 0x6b02d8
FreeMemory ptr 0x6b0268
inserted 2
FreeMemory ptr 0x6b0258
FreeMemory ptr 0x6b02d8
FreeMemory ptr 0x6b02b0
FreeMemory ptr 0x6b02a0
FreeMemory ptr 0x6b0278

So of your 3 0-sized allocations:

One is to copy the empty vector into the pair.
One is to store a copy of the empty vector in the map.

These two are clearly necessary. What I'm not sure about is this:

One is to copy the vector somewhere in the call to insert, and this is also freed in the call to insert.

It's as if insert (or something it calls internally) is taking its parameter by value instead of by reference, or insert is explicitly taking a copy into an automatic variable some time before it allocates the new map node. Firing up a debugger is effort for me at the moment, I'll leave it to someone else.

Edit: mystery solved. insert takes a std::pair<const int, vec>, not a std::pair<int, vec>. The extra copy of an empty vector is because the pair you construct has to be converted to a(nother) temporary, then a reference to that temporary is passed to insert. std::pair has a constructor template that lets you get away with almost anything. 20.2.2/4:

template<class U, class V> pair(const pair<U,V> &p);

Effects: initializes members from the
corresponding members of the argument,
performing implicit conversions as
needed.

I also observe that in my implementation, vec x; doesn't call getMem, but vec x(0); does. So actually:

z[1] = vec();

Is less code and denies you the opportunity to make the extra copy (although it calls operator= instead). It does still make 2 0-sized allocations, at least for me.

The C++ standard defines operator[] to return the result of a specified expression involving a call to insert. I'm not certain whether this means the effects of operator[] are "as if" make_pair and insert were called (that is, the standard is as good as specifying what the source must be for operator[]), or just that the value returned is the same value as the specified expression would yield. If the latter then perhaps an implementation could do it with a single 0-sized allocation. But certainly map has no guaranteed way to create an entry without first creating a pair that contains the mapped type, so 2 allocations should be expected. Or more properly, 2 copies of the desired mapped value: the fact that copying a 0-sized vector makes a 0-sized allocation is implementation-dependent.

So, if you had a case where the value was really expensive to copy, but really cheap to default-construct (like a container with lots of elements), then the following might be useful:

std::map<int, vec> z;
vec x(1000);
z[1] = x;
// i.e. (*(z.insert(std::pair<const int, vec>(1,vec())).first)).second = x;

makes 2 allocations of size 4000 and 2 of size 0, whereas:

std::map<int, vec> z;
vec x(1000);
z.insert(std::pair<const int, vec>(2, x));

makes 3 of size 4000 and none of size 0. Eventually the size is big enough that the extra allocation in the first code is cheaper than the extra copying in the second code.

It's possible that move-constructors in C++0x will help with this, I'm not sure.

回复收藏 0 原文