Stack size used in kernel development

Posted on 2024-07-06 09:53:11

I'm developing an operating system, and rather than programming the kernel, I'm designing it. The operating system targets the x86 architecture and modern computers. The estimated RAM requirement is 256 MB or more.

What is a good size to make the stack for each thread run on the system? Should I try to design the system in such a way that the stack can be extended automatically if the maximum length is reached?

I think, if I remember correctly, that a page of RAM is 4k (4096 bytes), and that just doesn't seem like a lot to me. I can definitely see times, especially when using lots of recursion, when I would want more than 1000 integers (about 4 KB at 4 bytes each) on the stack at once. Now, the real solution would be to have the program use malloc and manage its own memory resources, but really I would like to know users' opinions on this.

Is 4k big enough for a stack with modern computer programs? Should the stack be bigger than that? Should the stack auto-expand to accommodate any size? I'm interested in this both from a practical developer's standpoint and a security standpoint.

Is 4k too big for a stack? Considering normal program execution, especially from the point of view of classes in C++, I notice that good source code tends to malloc/new the data it needs when classes are created, to minimize the data being thrown around in a function call.

What I haven't even gotten into is the size of the processor's cache memory. Ideally, I think the stack would reside in the cache to speed things up and I'm not sure if I need to achieve this, or if the processor can handle it for me. I was just planning on using regular boring old RAM for testing purposes. I can't decide. What are the options?

Comments (5)

策马西风 2024-07-13 09:53:11

Stack size depends on what your threads are doing. My advice:

  • make the stack size a parameter at thread creation time (different threads will do different things, and hence will need different stack sizes)
  • provide a reasonable default for those who don't want to be bothered with specifying a stack size (4K appeals to the control freak in me, as it will cause the stack-profligate to, er, get the signal pretty quickly)
  • consider how you will detect and deal with stack overflow. Detection can be tricky. You can put empty guard pages at the ends of your stack, and that will generally work. But you are relying on the Bad Thread not to leap over that moat and start polluting what lies beyond. Generally that won't happen... but then, that's what makes the really tough bugs tough. An airtight mechanism involves hacking your compiler to generate stack checking code. As for dealing with a stack overflow, you will need a dedicated stack somewhere else on which the offending thread (or its guardian angel, whoever you decide that is; you're the OS designer, after all) will run.
  • I would strongly recommend marking the ends of your stack with a distinctive pattern, so that when your threads run over the ends (and they always do), you can at least go in post-mortem and see that something did in fact run off its stack. A page of 0xDEADBEEF or something like that is handy.
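To make the first, second, and fourth points concrete, here is a minimal sketch in C. It runs in user space, with malloc standing in for whatever page allocator the kernel would use, and it paints the whole stack rather than just one page at the end. The names thread_stack, stack_create, stack_unused_bytes, simulate_use, and DEFAULT_STACK_SIZE are hypothetical and not taken from any real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_STACK_SIZE 4096u        /* assumed default: one 4 KB page */
#define STACK_PAINT        0xDEADBEEFu  /* pattern suggested in the answer */

/* Hypothetical per-thread stack descriptor. */
struct thread_stack {
    uint32_t *base;   /* lowest address of the stack area */
    size_t    words;  /* size of the area in 32-bit words */
};

/* Allocate a stack of the requested size (0 selects the default) and
 * paint every word with the pattern. Returns 0 on success, -1 on failure. */
static int stack_create(struct thread_stack *s, size_t size_bytes)
{
    assert(s != NULL);
    if (size_bytes == 0)
        size_bytes = DEFAULT_STACK_SIZE;
    s->words = size_bytes / sizeof(uint32_t);
    s->base = malloc(s->words * sizeof(uint32_t));
    if (s->base == NULL)
        return -1;
    for (size_t i = 0; i < s->words; i++)
        s->base[i] = STACK_PAINT;
    return 0;
}

/* x86 stacks grow downward, so the words nearest base are used last.
 * Count the bytes from base upward that still hold the paint: that is
 * the space the thread never touched. A result of 0 means the thread
 * reached the end of its stack and may have run past it. */
static size_t stack_unused_bytes(const struct thread_stack *s)
{
    size_t i = 0;
    while (i < s->words && s->base[i] == STACK_PAINT)
        i++;
    return i * sizeof(uint32_t);
}

/* Stand-in for a running thread: overwrite the top used_bytes of the stack. */
static void simulate_use(struct thread_stack *s, size_t used_bytes)
{
    size_t used_words = used_bytes / sizeof(uint32_t);
    if (used_words > s->words)
        used_words = s->words;
    for (size_t i = s->words - used_words; i < s->words; i++)
        s->base[i] = 0;
}

static void stack_destroy(struct thread_stack *s)
{
    free(s->base);
    s->base = NULL;
    s->words = 0;
}
```

Calling stack_create(&s, 0) picks the default size, and stack_unused_bytes(&s) can be checked post-mortem or periodically to see how close each thread came to the end. The check is a heuristic: a thread that happens to store 0xDEADBEEF itself would make the stack look less used than it was.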

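By the way, x86 page sizes are generally 4k, but they do not have to be. x86 also supports larger pages: 4 MB (or 2 MB with PAE and in 64-bit mode), and 1 GB on x86-64 processors that support it; 64k pages are found on some other architectures. The usual reason for larger pages is to avoid TLB misses. Again, I would make it a kernel configuration or run-time parameter.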

指尖凝香 2024-07-13 09:53:11

Search for KERNEL_STACK_SIZE in the Linux kernel source code and you will find that it is very much architecture dependent: PAGE_SIZE, 2*PAGE_SIZE, etc. (Below are just some of the results; many intermediate lines of output have been removed.)

./arch/cris/include/asm/processor.h:
#define KERNEL_STACK_SIZE PAGE_SIZE

./arch/ia64/include/asm/ptrace.h:
# define KERNEL_STACK_SIZE_ORDER        3
# define KERNEL_STACK_SIZE_ORDER        2
# define KERNEL_STACK_SIZE_ORDER        1
# define KERNEL_STACK_SIZE_ORDER        0
#define IA64_STK_OFFSET         ((1 << KERNEL_STACK_SIZE_ORDER)*PAGE_SIZE)
#define KERNEL_STACK_SIZE       IA64_STK_OFFSET

./arch/ia64/include/asm/mca.h:
    u64 mca_stack[KERNEL_STACK_SIZE/8];
    u64 init_stack[KERNEL_STACK_SIZE/8];

./arch/ia64/include/asm/thread_info.h:
#define THREAD_SIZE         KERNEL_STACK_SIZE

./arch/ia64/include/asm/mca_asm.h:
#define MCA_PT_REGS_OFFSET      ALIGN16(KERNEL_STACK_SIZE-IA64_PT_REGS_SIZE)

./arch/parisc/include/asm/processor.h:
#define KERNEL_STACK_SIZE   (4*PAGE_SIZE)

./arch/xtensa/include/asm/ptrace.h:
#define KERNEL_STACK_SIZE (2 * PAGE_SIZE)

./arch/microblaze/include/asm/processor.h:
# define KERNEL_STACK_SIZE  0x2000

谈情不如逗狗 2024-07-13 09:53:11

I'll throw my two cents in to get the ball rolling:

  • I'm not sure what a "typical" stack size would be. I would guess maybe 8 KB per thread, and if a thread exceeds this amount, just throw an exception. However, according to this, Windows has a default reserved stack size of 1MB per thread, but it isn't committed all at once (pages are committed as they are needed). Additionally, you can request a different stack size for a given EXE at compile-time with a compiler directive. Not sure what Linux does, but I've seen references to 4 KB stacks (although I think this can be changed when you compile the kernel and I'm not sure what the default stack size is...)

  • This ties in with the first point. You probably want a fixed limit on how much stack each thread can get. Thus, you probably don't want to automatically allocate more stack space every time a thread exceeds its current stack space, because a buggy program that gets stuck in an infinite recursion is going to eat up all available memory.
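A fixed limit combined with on-demand commitment could look roughly like the sketch below. It is a user-space model in C, not kernel code: stack_region, stack_fault, and run_until_killed are made-up names, and the 4 KB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumed page size */

/* Hypothetical bookkeeping for one thread's stack, counted in pages. */
struct stack_region {
    size_t committed_pages;  /* pages currently backed by memory */
    size_t max_pages;        /* fixed per-thread limit (the reservation) */
};

enum fault_action { FAULT_GROW, FAULT_KILL };

/* Called when a thread touches the page just below its committed stack.
 * Commits one more page while under the limit; otherwise reports that
 * the thread should be terminated or sent an exception. */
static enum fault_action stack_fault(struct stack_region *r)
{
    assert(r != NULL);
    if (r->committed_pages >= r->max_pages)
        return FAULT_KILL;
    r->committed_pages++;
    return FAULT_GROW;
}

/* Model runaway recursion: keep faulting until the policy says stop.
 * Returns the number of bytes committed when the thread was stopped. */
static size_t run_until_killed(struct stack_region *r)
{
    while (stack_fault(r) == FAULT_GROW)
        ;
    return r->committed_pages * PAGE_SIZE;
}
```

With max_pages set to 256, an infinitely recursing thread is stopped after 1 MB instead of consuming all available memory, while a well-behaved thread only ever pays for the pages it touches.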

二智少女猫性小仙女 2024-07-13 09:53:11

If you are using virtual memory, you do want to make the stack growable. Forcing static allocation of stack sizes, as is common in user-level threading packages such as Qthreads and Windows Fibers, is a mess: hard to use, easy to crash. All modern OSes do grow the stack dynamically, I think usually by having a write-protected guard page or two below the current stack pointer. A write there tells the OS that the stack has stepped below its allocated space; you then allocate a new guard page below that and make the page that got hit writable. As long as no single function allocates more than a page of data, this works fine. Or you can use two or four guard pages to allow larger stack frames.
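The guard-page idea, and the reason a large stack frame can defeat it, can be modelled in a few lines of C. This is only a simulation with plain integers as addresses, under an assumed 4 KB page size; guarded_stack and stack_access are hypothetical names, and a real kernel would do this work in its page-fault handler.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumed page size */

enum access_result { ACCESS_OK, ACCESS_GREW, ACCESS_MISSED_GUARD };

/* Hypothetical model of a downward-growing stack. The guard window is
 * the guard_pages pages directly below low. Assumes low is larger than
 * the guard window, so the subtraction below cannot wrap. */
struct guarded_stack {
    size_t low;          /* lowest writable address, page aligned */
    size_t guard_pages;  /* number of guard pages directly below low */
};

/* Classify one write by the thread at address addr. */
static enum access_result stack_access(struct guarded_stack *s, size_t addr)
{
    size_t guard_bytes = s->guard_pages * PAGE_SIZE;

    assert(s->low >= guard_bytes);
    if (addr >= s->low)
        return ACCESS_OK;               /* already writable */
    if (addr >= s->low - guard_bytes) {
        /* Guard hit: make everything down to the touched page writable.
         * The guard window implicitly moves down along with low. */
        s->low = addr & ~(size_t)(PAGE_SIZE - 1);
        return ACCESS_GREW;
    }
    /* The write jumped clean over the guard window, e.g. a function
     * with a frame larger than the window touched its far end first. */
    return ACCESS_MISSED_GUARD;
}
```

With one guard page, a write 8 KB below the stack lands beyond the guard and goes undetected as a stack access; with four guard pages the same write is caught and the stack grows. That is the trade-off behind using more guard pages for larger frames.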

If you want a way to control stack size and your goal is a really controlled and efficient environment, but you do not care about programming in the same style as Linux etc., go for a single-shot execution model where a task is started each time a relevant event is detected, runs to completion, and then stores any persistent data in its task data structure. In this way, all threads can share a single stack. This model is used in many slim real-time operating systems for automotive control and similar applications.
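A single-shot, run-to-completion model might be sketched as follows. This is an illustrative C model with invented names (task, ready_queue, post_event, dispatch_all); priorities and preemption, which a real RTOS would add, are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record: persistent state lives here, not on a stack. */
struct task {
    void (*run)(struct task *self);  /* runs to completion, never blocks */
    int state;                       /* data kept between activations */
};

#define QUEUE_CAP 8

/* Fixed-size FIFO of tasks whose events have fired. */
struct ready_queue {
    struct task *slots[QUEUE_CAP];
    size_t head;
    size_t count;
};

/* Record that an event for task t occurred. Returns -1 if the queue is full. */
static int post_event(struct ready_queue *q, struct task *t)
{
    assert(q != NULL && t != NULL);
    if (q->count == QUEUE_CAP)
        return -1;
    q->slots[(q->head + q->count) % QUEUE_CAP] = t;
    q->count++;
    return 0;
}

/* Dispatcher: each task is an ordinary function call on the dispatcher's
 * own stack, so one stack serves every task. Returns the number of
 * activations that were run. */
static size_t dispatch_all(struct ready_queue *q)
{
    size_t activations = 0;
    while (q->count > 0) {
        struct task *t = q->slots[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        t->run(t);
        activations++;
    }
    return activations;
}

/* Example task body: remembers how often it has been activated. */
static void count_activation(struct task *self)
{
    self->state++;
}
```

Because a task never blocks in the middle of its work, nothing of it remains on the stack between activations, which is why the shared stack only needs to be as deep as the deepest single task.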

蓝眼泪 2024-07-13 09:53:11

Why not make the stack size a configurable item, either stored with the program or specified when a process creates another process?

There are any number of ways you can make this configurable.

There's a guideline that states "0, 1 or n", meaning you should allow zero, one or any number (limited by other constraints such as memory) of an object - this applies to sizes of objects as well.
