How does stack memory grow?

Posted on 2024-09-09 03:11:16

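In a typical C program, the Linux kernel provides 84K - ~100K of memory for the stack. When the process has used up the given memory, how does the kernel allocate more memory for the stack?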

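IMO, when the process has used up all of the stack memory and then touches the next contiguous memory, it should ideally take a page fault, which the kernel then handles. Is this where the kernel gives the process more memory for its stack, and which data structure in the Linux kernel identifies the size of a process's stack?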

丘比特射中我 2024-09-16 03:11:16

There are a number of different methods in use, depending on the OS (realtime vs. normal Linux) and the language runtime system underneath:

1) dynamic, by page fault

Typically a few real pages are preallocated at high addresses and the initial sp is set to point there. The stack grows downward, the heap grows upward. If a page fault happens somewhat below the stack bottom, the missing intermediate pages are allocated and mapped, which effectively grows the stack from the top towards the bottom automatically. There is typically a maximum up to which such automatic allocation is performed; depending on the OS, it may be specified in the environment (ulimit), in the exe header, or adjusted dynamically by the program via a system call (rlimit). This adjustability in particular varies heavily between OSes. There is also typically a limit on how far below the stack bottom a page fault may lie and still be considered ok and trigger an automatic grow. Notice that not every system's stack grows downward: under HP-UX it grows (or used to grow) upward, so I am not sure what Linux on PA-RISC does (can someone comment on this?).
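To make scheme 1 a bit more concrete, here is a minimal C sketch, assuming a Linux or other POSIX system. It reads the stack limit with `getrlimit(RLIMIT_STACK, ...)` and then recurses deep enough to need far more stack than the initial mapping, relying on the kernel to map the missing pages on fault. The helper names `stack_limits` and `touch_stack` are made up for this illustration, and the 4 KiB per frame is an assumed page size.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Read the soft and hard stack limits of the calling process.
 * Returns 0 on success, -1 if getrlimit() fails. */
int stack_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    assert(soft != NULL && hard != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;   /* automatic growth stops at this limit */
    *hard = rl.rlim_max;   /* ceiling up to which the soft limit may be raised */
    return 0;
}

/* Recurse `depth` times, touching roughly 4 KiB of fresh stack per frame.
 * The volatile buffer keeps the compiler from dropping the frame or
 * turning the recursion into a loop. Returns the number of frames entered. */
size_t touch_stack(size_t depth)
{
    volatile char page[4096];

    page[0] = (char)depth;
    page[sizeof page - 1] = (char)depth;
    if (depth == 0)
        return 0;
    /* Reading the buffer after the call keeps it alive across the recursion. */
    return 1 + touch_stack(depth - 1)
             + (size_t)(page[0] - page[sizeof page - 1]);
}
```

Calling `touch_stack(64)` walks through about 256 KiB of stack without the program ever asking for memory explicitly. The soft limit returned by `stack_limits` is the same value the shell reports with `ulimit -s` (there in KiB); once the recursion crosses it, the process gets a SIGSEGV instead of more pages.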

2) fixed size

Other OSes (especially in embedded and mobile environments) either have fixed stack sizes by definition, or take the size from the exe header, or have it specified when a program/thread is created. In embedded realtime controllers especially, this is often a configuration parameter, and individual control tasks get fixed stacks (to avoid runaway threads taking the memory of higher-priority control tasks). Of course, in this case too the memory might be allocated only virtually until it is really needed.

3) pagewise, spaghetti and similar

Such mechanisms tend to be forgotten, but are still in use in some runtime systems (I know of Lisp/Scheme and Smalltalk systems). These allocate and grow the stack dynamically as required; however, not as a single contiguous segment, but as a linked chain of multi-page chunks. This requires the compiler(s) to generate different function entry/exit code in order to handle segment boundaries. Therefore such schemes are typically implemented by a language support system and not by the OS itself (in earlier times it used to be - sigh). The reason is that when you have many (say 1000s of) threads in an interactive environment, preallocating say 1 MB for each would simply fill your virtual address space, and you could not support a system where the stack needs of an individual thread are not known beforehand (which is typically the case in a dynamic environment, where the user might enter eval-code into a separate workspace). Dynamic allocation as in scheme 1 above is not possible either, because other threads with their own stacks would be in the way. So the stack is made up of smaller segments (say 8-64k) which are allocated and deallocated from a pool and linked into a chain of stack segments. Such a scheme may also be required for high-performance support of things like continuations, coroutines etc.
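The chain-of-segments idea can be sketched in plain C as an explicit data stack. This is only an illustration of the data structure, not how any particular Lisp or Smalltalk runtime does it: a real segmented call stack needs the compiler-generated entry/exit checks mentioned above, and `malloc`/`free` stand in for the segment pool. All names (`struct segment`, `seg_push`, `seg_pop`, `SEG_SLOTS`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SEG_SLOTS 1024            /* assumed segment capacity, in slots */

struct segment {
    struct segment *prev;         /* link to the next older segment in the chain */
    size_t used;                  /* occupied slots in this segment */
    long slots[SEG_SLOTS];
};

struct seg_stack {
    struct segment *top;          /* newest segment, NULL when the stack is empty */
    size_t nsegments;             /* current length of the chain */
};

/* Push a value; when the top segment is full, link a fresh one in front.
 * Returns 0 on success, -1 if no segment could be allocated. */
int seg_push(struct seg_stack *s, long value)
{
    assert(s != NULL);
    if (s->top == NULL || s->top->used == SEG_SLOTS) {
        struct segment *seg = malloc(sizeof *seg);
        if (seg == NULL)
            return -1;
        seg->prev = s->top;
        seg->used = 0;
        s->top = seg;
        s->nsegments++;
    }
    s->top->slots[s->top->used++] = value;
    return 0;
}

/* Pop a value; a segment that becomes empty is unlinked and released.
 * Returns 0 on success, -1 if the stack is empty. */
int seg_pop(struct seg_stack *s, long *out)
{
    assert(s != NULL && out != NULL);
    if (s->top == NULL)
        return -1;
    *out = s->top->slots[--s->top->used];
    if (s->top->used == 0) {
        struct segment *empty = s->top;
        s->top = empty->prev;
        free(empty);
        s->nsegments--;
    }
    return 0;
}
```

Starting from `struct seg_stack s = {0};`, pushing 2500 values leaves a chain of three segments, and popping them all returns the memory segment by segment. No contiguous address range is ever reserved, which is the point when thousands of stacks share one address space.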

Modern unixes/linuxes and (I guess, but am not 100% certain) Windows use scheme 1) for the main thread of your exe, and 2) for additional (p-)threads, which need a fixed stack size given by the thread creator initially. Most embedded systems and controllers use fixed (but configurable) preallocation (even physically preallocated in many cases).
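For the pthread case, the fixed size is passed in through the thread attributes. A minimal sketch, assuming POSIX threads; `run_with_fixed_stack` and `fill_buffer` are hypothetical helper names, and the sizes are arbitrary examples.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Example worker: uses a 64 KiB local buffer, which has to fit into
 * whatever stack the thread creator chose. Stores the buffer size in *arg. */
void *fill_buffer(void *arg)
{
    volatile unsigned char buf[64 * 1024];
    size_t *count = arg;

    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)i;
    *count = sizeof buf;
    return NULL;
}

/* Run fn(arg) in a new thread whose stack is stack_bytes large and wait
 * for it to finish. Returns 0 on success or the pthread error code. */
int run_with_fixed_stack(size_t stack_bytes, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int err;

    assert(fn != NULL);
    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    /* Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN. */
    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

For example, `run_with_fixed_stack(256 * 1024, fill_buffer, &n)` gives the worker a 256 KiB stack. Unlike the main thread, such a stack does not grow on demand: a worker that needs more than it was given simply overruns it. On older glibc versions the program has to be linked with `-pthread`.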
