Stack size to use in kernel development
I'm developing an operating system and, rather than programming the kernel, I'm designing the kernel. This operating system targets the x86 architecture and modern computers. The estimated amount of RAM required is 256 MB or more.
What is a good stack size for each thread running on the system? Should I try to design the system so that the stack can be extended automatically if the maximum size is reached?
If I remember correctly, a page in RAM is 4k (4096 bytes), and that just doesn't seem like a lot to me. I can definitely see times, especially when using lots of recursion, when I would want to have more than 1000 integers in RAM at once. Now, the real solution would be to have the program do this by using malloc and managing its own memory resources, but really I would like to know the user opinion on this.
Is 4k big enough for a stack with modern computer programs? Should the stack be bigger than that? Should the stack auto-expand to accommodate any size? I'm interested in this both from a practical developer's standpoint and a security standpoint.
Is 4k too big for a stack? Considering normal program execution, especially from the point of view of classes in C++, I notice that good source code tends to malloc/new the data it needs when classes are created, to minimize the data being thrown around in a function call.
What I haven't even gotten into is the size of the processor's cache memory. Ideally, I think the stack would reside in the cache to speed things up and I'm not sure if I need to achieve this, or if the processor can handle it for me. I was just planning on using regular boring old RAM for testing purposes. I can't decide. What are the options?
Stack size depends on what your threads are doing. My advice:
By the way, x86 page sizes are generally 4 KB, but they do not have to be. x86 also supports larger pages (2 MB or 4 MB, and 1 GB on x86-64). The usual reason for larger pages is to reduce TLB misses. Again, I would make it a kernel configuration or run-time parameter.
Search for KERNEL_STACK_SIZE in the Linux kernel source code and you will find that it is very much architecture dependent: PAGE_SIZE, 2*PAGE_SIZE, etc. (below are just some results; much of the intermediate output is deleted).
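To illustrate the "multiple of the page size" idea, here is a minimal sketch in C. The macro and function names (MY_PAGE_SIZE, MY_KSTACK_PAGES, kstack_size_for) are made up for this example and are not taken from the Linux source; the 4096-byte page and the two-page default are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time knobs; names and values are illustrative only. */
#ifndef MY_PAGE_SIZE
#define MY_PAGE_SIZE 4096u      /* assumed page size */
#endif
#ifndef MY_KSTACK_PAGES
#define MY_KSTACK_PAGES 2u      /* assumed default: two pages per stack */
#endif

#define MY_KSTACK_SIZE (MY_KSTACK_PAGES * MY_PAGE_SIZE)

/* Round a byte count up to a whole number of pages. */
size_t kstack_round_up(size_t bytes)
{
    return (bytes + MY_PAGE_SIZE - 1) / MY_PAGE_SIZE * MY_PAGE_SIZE;
}

/* Choose the stack size for a new thread.
 * A request of 0 means "use the build-time default". */
size_t kstack_size_for(size_t requested)
{
    size_t size = requested ? kstack_round_up(requested) : MY_KSTACK_SIZE;
    assert(size % MY_PAGE_SIZE == 0);   /* stacks are always whole pages */
    return size;
}
```

With these assumed defaults, kstack_size_for(0) yields two pages, and any explicit request is rounded up to the next page boundary.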
I'll throw my two cents in to get the ball rolling:
I'm not sure what a "typical" stack size would be. I would guess maybe 8 KB per thread, and if a thread exceeds this amount, just throw an exception. However, according to this, Windows has a default reserved stack size of 1 MB per thread, but it isn't committed all at once (pages are committed as they are needed). Additionally, you can request a different stack size for a given EXE at compile time with a compiler directive. Not sure what Linux does, but I've seen references to 4 KB stacks (although I think this can be changed when you compile the kernel, and I'm not sure what the default stack size is...)
This ties in with the first point. You probably want a fixed limit on how much stack each thread can get. Thus, you probably don't want to automatically allocate more stack space every time a thread exceeds its current stack space, because a buggy program that gets stuck in an infinite recursion is going to eat up all available memory.
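A fixed cap combined with on-demand commit could be sketched roughly as follows. The struct and function names are hypothetical, and the 4096-byte page is an assumption; a real kernel would also have to map the page and decide how to signal the failing thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_BYTES 4096u   /* assumed page size */

/* Hypothetical per-thread stack bookkeeping. */
struct stack_budget {
    size_t committed;  /* bytes currently backed by memory */
    size_t limit;      /* hard cap; the stack never grows past this */
};

/* Try to commit one more page of stack.
 * Returns false once the cap is reached; at that point the kernel would
 * terminate or signal the thread instead of allocating more memory. */
bool stack_commit_page(struct stack_budget *s)
{
    assert(s->committed <= s->limit);
    if (s->limit - s->committed < PAGE_BYTES)
        return false;
    s->committed += PAGE_BYTES;
    return true;
}
```

A runaway recursion then costs at most `limit` bytes before the thread is stopped, rather than consuming all available memory.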
If you are using virtual memory, you do want to make the stack growable. Forcing static allocation of stack sizes, as is common in user-level threading like Qthreads and Windows Fibers, is a mess. Hard to use, easy to crash. All modern OSes do grow the stack dynamically, I think usually by having a write-protected guard page or two below the current stack pointer. A write there tells the OS that the stack has stepped below its allocated space; the OS then allocates a new guard page below that and makes the page that got hit writable. As long as no single function allocates more than a page of data, this works fine. Or you can use two or four guard pages to allow larger stack frames.
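The guard-page scheme described above can be sketched as the decision a page-fault handler would make. This is only an illustration under stated assumptions: the struct layout, the names, and the 4096-byte page are invented here, and the actual page mapping is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PG 4096u  /* assumed page size */

/* Hypothetical descriptor for a downward-growing stack.
 *   [low, top)                    usable (writable) pages
 *   [low - guard_pages*PG, low)   write-protected guard pages
 *   floor                         lowest address the guard region may reach */
struct gstack {
    uintptr_t top;
    uintptr_t low;
    uintptr_t floor;
    unsigned  guard_pages;
};

/* Returns true if the fault hit a guard page and the stack was grown.
 * Returns false if the fault is outside the guard region (not a growth
 * request) or if growing would go below the reserved floor. */
bool gstack_handle_fault(struct gstack *s, uintptr_t fault_addr)
{
    uintptr_t guard_bytes = (uintptr_t)s->guard_pages * PG;
    assert(s->low - s->floor >= guard_bytes);   /* guard fits above floor */

    uintptr_t guard_lo = s->low - guard_bytes;
    if (fault_addr < guard_lo || fault_addr >= s->low)
        return false;

    /* The page containing the faulting address becomes the lowest usable page. */
    uintptr_t new_low = fault_addr & ~(uintptr_t)(PG - 1);
    if (new_low - s->floor < guard_bytes)
        return false;               /* no room left for new guard pages */

    /* A real kernel would now map [new_low, s->low) writable and
     * write-protect the guard_pages pages below new_low. */
    s->low = new_low;
    return true;
}
```

A fault that lands below the guard region returns false here, which corresponds to the case where a single function allocates more stack than the guard pages cover; using more guard pages widens that window.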
If you want a way to control stack size and your goal is a really controlled and efficient environment, but you do not care about programming in the same style as Linux etc., go for a single-shot execution model where a task is started each time a relevant event is detected, runs to completion, and then stores any persistent data in its task data structure. In this way, all threads can share a single stack. This model is used in many slim real-time operating systems for automotive control and similar.
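A run-to-completion model like this might look roughly as follows. The task structure and dispatcher are hypothetical and heavily simplified (no priorities, no preemption); the point is only that each task runs on the dispatcher's stack and keeps its persistent data in its own structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor: no private stack, only persistent state. */
struct task {
    void (*run)(struct task *self);  /* runs to completion, never blocks */
    int   state;                     /* persistent data kept between runs */
    int   pending;                   /* non-zero when an event is waiting */
};

/* Example task body: count how many events this task has handled. */
void counter_task(struct task *self)
{
    self->state += 1;
}

/* One dispatcher pass: every pending task runs on the caller's stack,
 * one after another. Returns the number of tasks that were run. */
size_t dispatch_pending(struct task *tasks, size_t n)
{
    size_t ran = 0;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].pending) {
            assert(tasks[i].run != NULL);
            tasks[i].pending = 0;
            tasks[i].run(&tasks[i]);   /* returns before the next task starts */
            ran++;
        }
    }
    return ran;
}
```

Because a task always returns before the next one starts, the worst-case stack usage is that of the deepest single task rather than the sum over all tasks.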
Why not make the stack size a configurable item, either stored with the program or specified when a process creates another process?
There are any number of ways you can make this configurable.
There's a guideline that states "0, 1 or n", meaning you should allow zero, one or any number (limited by other constraints such as memory) of an object - this applies to sizes of objects as well.
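One way such a configurable size might be resolved is sketched below. The precedence order (creator's request first, then the value stored with the program, then a system default) is an assumption made for this example, as are the names and the 64 KB default.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_STACK_BYTES (64u * 1024u)  /* assumed system default */

/* Hypothetical inputs; a value of 0 means "not specified". */
struct stack_request {
    size_t from_creator;  /* passed by the process creating the new one */
    size_t from_image;    /* stored with the program */
};

/* Pick the stack size: creator's request, else the program's own value,
 * else the system default. */
size_t resolve_stack_size(const struct stack_request *r)
{
    assert(r != NULL);
    if (r->from_creator)
        return r->from_creator;
    if (r->from_image)
        return r->from_image;
    return DEFAULT_STACK_BYTES;
}
```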