Proper stack and heap usage in C++?
I've been programming for a while, but it's been mostly Java and C#. I've never actually had to manage memory on my own. I recently began programming in C++ and I'm a little confused as to when I should store things on the stack and when to store them on the heap.
My understanding is that variables which are accessed very frequently should be stored on the stack and objects, rarely used variables, and large data structures should all be stored on the heap. Is this correct or am I incorrect?
No, the difference between stack and heap isn't performance. It's lifespan: any local variable inside a function (anything you do not malloc() or new) lives on the stack. It goes away when you return from the function. If you want something to live longer than the function that declared it, you must allocate it on the heap.
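The lifetime point can be sketched as follows. This is only an illustration: the Config type and both function names are made up for the example, and std::make_unique assumes C++14 or later.

```cpp
#include <memory>
#include <string>

// Hypothetical type, used only for illustration.
struct Config {
    std::string name;
    int retries;
};

// 'c' has automatic storage and ends with the function call;
// returning it by value hands the caller its own object.
Config make_on_stack() {
    Config c{"local", 3};
    return c;
}

// The object created here lives on the heap and outlives this function.
// std::unique_ptr deletes it when its owner goes out of scope.
std::unique_ptr<Config> make_on_heap() {
    return std::make_unique<Config>(Config{"heap", 5});
}
```

Returning a pointer or reference to the local c instead would leave the caller with a dangling pointer, which is the case the heap version avoids.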
For a clearer understanding of what the stack is, come at it from the other end -- rather than try to understand what the stack does in terms of a high level language, look up "call stack" and "calling convention" and see what the machine really does when you call a function. Computer memory is just a series of addresses; "heap" and "stack" are inventions of the compiler.
I would say:
Store it on the stack, if you CAN.
Store it on the heap, if you NEED TO.
Therefore, prefer the stack to the heap. Some possible reasons that you can't store something on the stack are:
It is possible, with sensible compilers, to allocate non-fixed size objects on the heap (usually arrays whose size is not known at compile time).
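As a rough sketch of the last point, an array whose length is only known at run time can be held in a std::vector, which takes its element storage from the heap. The function name below is hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// 'n' is only known at run time, so a plain local array 'int a[n]' is not
// standard C++. std::vector obtains the element storage from the heap.
long sum_first_n(std::size_t n) {
    std::vector<int> values(n);
    std::iota(values.begin(), values.end(), 1);  // 1, 2, ..., n
    return std::accumulate(values.begin(), values.end(), 0L);
}
```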
It's more subtle than the other answers suggest. There is no absolute divide between data on the stack and data on the heap based on how you declare it. For example:
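The declaration this answer refers to is not shown in this copy. As a minimal sketch, assuming it was a local std::vector<int> with ten elements, the following checks that the elements are not stored inside the vector object itself. The helper names are made up for the example.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the vector's element storage lies outside the bytes of
// the vector object itself.
bool storage_is_outside_object(const std::vector<int>& v) {
    auto obj_begin = reinterpret_cast<std::uintptr_t>(&v);
    auto obj_end   = obj_begin + sizeof(v);
    auto data      = reinterpret_cast<std::uintptr_t>(v.data());
    return data < obj_begin || data >= obj_end;
}

bool demo() {
    std::vector<int> v(10);  // the object 'v' is a local (on the stack)
    return storage_is_outside_object(v);
}
```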
In the body of a function, a declaration such as std::vector<int> v(10) puts a vector (dynamic array) of ten integers on the stack. But the storage managed by the vector is not on the stack.
Ah, but (the other answers suggest) the lifetime of that storage is bounded by the lifetime of the vector itself, which here is stack-based, so it makes no difference how it's implemented - we can only treat it as a stack-based object with value semantics.
Not so. Suppose the function was:
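The function itself is missing from this copy. A plausible sketch, assuming it fills a local vector and then swaps it into an output parameter (the name and signature here are hypothetical), might look like this:

```cpp
#include <numeric>
#include <vector>

// Hypothetical name and signature; the original function is not shown.
void get_some_numbers(std::vector<int>& result) {
    std::vector<int> v(10);              // local object, heap-backed storage
    std::iota(v.begin(), v.end(), 0);    // fill v with 0..9
    result.swap(v);                      // hand the storage to the caller
}   // v is destroyed here, holding whatever 'result' held before
```

The heap storage that v allocated outlives v, because after the swap it belongs to result.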
So anything with a swap function (and any complex value type should have one) can serve as a kind of rebindable reference to some heap data, under a system which guarantees a single owner of that data.
Therefore the modern C++ approach is to never store the address of heap data in naked local pointer variables. All heap allocations must be hidden inside classes.
If you do that, you can think of all variables in your program as if they were simple value types, and forget about the heap altogether (except when writing a new value-like wrapper class for some heap data, which ought to be unusual).
You merely have to retain one special bit of knowledge to help you optimise: where possible, instead of assigning one variable to another like this:
a = b;
swap them like this:
a.swap(b);
because it's much faster and it doesn't throw exceptions. The only requirement is that you don't need b to continue to hold the same value (it's going to get a's value instead, which would be trashed in a = b).
The downside is that this approach forces you to return values from functions via output parameters instead of the actual return value. But they're fixing that in C++0x with rvalue references.
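To illustrate why swapping is cheap, here is a small sketch that assumes both variables are std::vector<int>. The function name is hypothetical. It checks that after the swap, a owns the very buffer b held before, so no elements were copied.

```cpp
#include <vector>

// Returns true if swapping exchanged the buffers without copying elements.
bool swap_exchanges_buffers() {
    std::vector<int> a(1000, 1);
    std::vector<int> b(1000, 2);
    const int* b_data_before = b.data();

    a.swap(b);  // constant time: the two vectors trade their buffers

    return a.data() == b_data_before && a[0] == 2 && b[0] == 1;
}
```

Since C++11, a = std::move(b) expresses a similar intent through the rvalue references mentioned above.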
In the most complicated situations of all, you would take this idea to the general extreme and use a smart pointer class such as shared_ptr, which is already in tr1. (Although I'd argue that if you seem to need it, you've possibly moved outside Standard C++'s sweet spot of applicability.)
You would also store an item on the heap if it needs to be used outside the scope of the function in which it is created. One idiom used with stack objects is called RAII: a stack-based object acts as a wrapper for a resource, and when the object is destroyed, the resource is cleaned up. Stack-based objects are easier to keep track of when exceptions might be thrown, because you don't need to concern yourself with deleting a heap-based object in an exception handler. This is why raw pointers are not normally used in modern C++; you would use a smart pointer, which can be a stack-based wrapper for a raw pointer to a heap-based object.
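A minimal sketch of RAII with a smart pointer follows. The Resource type and the function names are invented for the example, and std::unique_ptr is just one possible smart pointer. The instance counter exists only so that the cleanup can be observed.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts live instances so cleanup can be observed.
struct Resource {
    static int live;
    Resource()  { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

void work_that_throws() {
    auto r = std::make_unique<Resource>();  // stack-based owner, heap object
    throw std::runtime_error("something failed");
    // no delete needed: unwinding destroys 'r', which deletes the Resource
}

bool cleaned_up_after_exception() {
    try {
        work_that_throws();
    } catch (const std::runtime_error&) {
        return Resource::live == 0;
    }
    return false;
}
```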
To add to the other answers, it can also be about performance, at least a little bit. Not that you should worry about this unless it's relevant for you, but:
Allocating on the heap requires finding and tracking a block of memory, which is not a constant-time operation (and takes some cycles and overhead). This can get slower as memory becomes fragmented, and/or as you get close to using 100% of your address space. On the other hand, stack allocations are constant-time, basically "free" operations.
Another thing to consider (again, really only important if it becomes an issue) is that typically the stack size is fixed, and can be much lower than the heap size. So if you're allocating large objects or many small objects, you probably want to use the heap; if you run out of stack space, you will hit the site's titular error, a stack overflow. Not usually a big deal, but another thing to consider.
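As an illustration of the stack-size point, a large buffer can be moved to the heap by using std::vector instead of a local array. The 8 MiB figure is an assumption chosen for the example; default stack limits vary by platform and are often somewhere between 1 and 8 MiB.

```cpp
#include <cstddef>
#include <vector>

// 8 MiB: an assumed size, large enough that a local array could exceed
// the default stack limit on common platforms.
constexpr std::size_t kBufferSize = 8 * 1024 * 1024;

std::size_t process_large_buffer() {
    // char buffer[kBufferSize];            // risky: lives on the stack
    std::vector<char> buffer(kBufferSize);  // storage comes from the heap
    buffer.back() = 1;
    return buffer.size();
}
```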
The stack is more efficient, and makes scoped data easier to manage.
But the heap should be used for anything larger than a few KB (it's easy in C++, just create a boost::scoped_ptr on the stack to hold a pointer to the allocated memory).
Consider a recursive algorithm that keeps calling into itself. It's very hard to limit or guess the total stack usage! Whereas on the heap, the allocator (malloc() or new) can indicate out-of-memory by returning NULL or throwing.
Source: the Linux kernel, whose stack is no larger than 8KB!
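A small sketch of both ideas, with two substitutions stated plainly: std::unique_ptr from the standard library stands in for boost::scoped_ptr, and new (std::nothrow) is used so that failure shows up as a null pointer rather than an exception. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>

// Returns false instead of crashing when the allocation cannot be satisfied.
bool zero_fill_scratch(std::size_t bytes) {
    // new (std::nothrow) reports failure by returning a null pointer.
    std::unique_ptr<char[]> scratch(new (std::nothrow) char[bytes]);
    if (!scratch) {
        return false;  // out of memory: the caller can recover
    }
    std::memset(scratch.get(), 0, bytes);
    return true;
}   // the buffer is released here, when the stack-based owner is destroyed
```

A stack overflow offers no comparable way to detect the failure and recover.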
For completeness, you may read Miro Samek's article about the problems of using the heap in the context of embedded software.
A Heap of Problems
The choice of whether to allocate on the heap or on the stack is one that is made for you, depending on how your variable is allocated. If you allocate something dynamically, using a "new" call, you are allocating from the heap. If you declare something as a local variable, or as a parameter in a function, it is allocated on the stack. (A global variable is in neither place: it has static storage and exists for the whole run of the program.)
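As a short sketch, the way a variable is declared determines where it lives. All names below are made up for the example. Note that in standard C++ a global has static storage duration, which is separate from both the stack and the heap.

```cpp
#include <memory>

int g_counter = 0;  // static storage: exists for the whole program run

int storage_demo(int parameter) {                     // parameter: automatic (stack)
    int local = parameter + 1;                        // local: automatic (stack)
    auto dynamic = std::make_unique<int>(local * 2);  // dynamic storage (heap)
    ++g_counter;
    return *dynamic;
}
```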
In my opinion there are two deciding factors.
I would prefer to use the stack in most cases, but if you need access to a variable outside its scope you can use the heap.
To enhance performance when using the heap, you can also allocate one block of heap memory for several variables, which can be faster than allocating each variable at a different memory location.
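One way to read the "heap block" suggestion is to make a single allocation that holds many objects contiguously. The sketch below assumes that reading; the Particle type and the function name are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type.
struct Particle {
    float x, y, z;
};

// One heap allocation holds all particles contiguously, instead of one
// 'new Particle' per element scattered across the heap.
std::vector<Particle> make_particles(std::size_t count) {
    std::vector<Particle> block;
    block.reserve(count);  // single allocation up front
    for (std::size_t i = 0; i < count; ++i) {
        block.push_back(Particle{static_cast<float>(i), 0.0f, 0.0f});
    }
    return block;
}
```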
This has probably been answered quite well already. I would like to point you to the series of articles below for a deeper understanding of the low-level details. Alex Darby has a series of articles in which he walks you through with a debugger. Here is Part 3, about the stack.
http://www.altdevblogaday.com/2011/12/14/c-c-low-level-curriculum-part-3-the-stack/