Optimizing out an unnecessary string copy in a vector
Presenting the minimal code to describe the problem:
struct A {
    vector<string> v;
    // ... other data and methods
};
A obj;
ifstream file("some_file.txt");
char buffer[BIG_SIZE];
while( <big loop> ) {
    file.getline(buffer, BIG_SIZE-1);
    // process buffer; which may change its size
    obj.v.push_back(buffer); // <------- can be optimized ??
}
...
Here a string is created twice: once to build the actual string object from buffer, and a second time when that object is copy-constructed into the vector. (Demo)
The push_back() operation happens millions of times, and each time I pay for one extra allocation that is of no use to me.
Is there a way to optimize this? I am open to any suitable change. (I am not categorizing this as premature optimization, because push_back() happens so many times throughout the code.)
Comments (3)
Well, you do get two allocations, but not both of them are for the string: one creates the string, while the other creates just a pointer inside the vector. (Note that this depends on the compiler: some compilers or settings might indeed create two strings, but most won't.) Look at this code for the demo.
One way to optimize it would be to use char* instead of string as the template parameter (don't forget to manually delete each pointer before destroying the vector!). This way you get rid of one of the allocations (the biggest one). Alternatively, just use your own implementation of vector: you will then be able to control every aspect of memory allocation.
You can try a couple of things. The first is obviously to enable optimization in the compiler.
If you can declare it as a vector<const string>, that may help. Otherwise you might try something like:
Instead of keeping the buffer on the stack, put it on the heap. Then use a vector of pointers. Only one