Other than malloc/free, does a program need the OS to provide anything else?

Posted 2024-07-06 09:38:37

I'm working on designing the kernel (which I'm going to actually call the "core" just to be different, but it's basically the same) for an OS I'm working on. The specifics of the OS itself are irrelevant if I can't get multi-tasking, memory management, and other basic things up and running, so I need to work on that first. I have some questions about designing a malloc routine.

I figure that malloc() is either going to be a part of the kernel itself (I'm leaning towards this) or a part of the program, but I'm going to have to write my own implementation of the C standard library either way, so I get to write a malloc. My question is actually rather simple in this regard, how does C (or C++) manage its heap?

What I've always been taught in theory classes is that the heap is an ever expanding piece of memory, starting at a specified address, and in a lot of senses behaving like a stack. In this way, I know that variables declared in global scope are at the beginning, and more variables are "pushed" onto the heap as they are declared in their respective scopes, and variables that go out of scope are simply left in memory space, but that space is marked as free so the heap can expand more if it needs to.

What I need to know is, how on earth does C actually handle a dynamically expanding heap in this manner? Does a compiled C program make its own calls to a malloc routine and handle its own heap, or do I need to provide it with an automatically expanding space? Also, how does the C program know where the heap begins?

Oh, and I know that the same concepts apply to other languages, but I would like any examples to be in C/C++ because I'm most comfortable with that language. I also would like to not worry about other things such as the stack, as I think I'm able to handle things like this on my own.

So I suppose my real question is, other than malloc/free (which handle getting and freeing pages for themselves, etc.), does a program need the OS to provide anything else?

Thanks!

EDIT I'm more interested in how C uses malloc in relation with the heap than in the actual workings of the malloc routine itself. If it helps, I'm doing this on x86, but C is cross-platform so it shouldn't matter. ^_^

EDIT FURTHER: I understand that I may be getting terms confused. I was taught that the "heap" was where the program stored things like global/local variables. I'm used to dealing with a "stack" in assembly programming, and I just realized that I probably mean that instead. A little research on my part shows that "heap" is more commonly used to refer to the total memory that a program has allocated for itself, or, the total number (and order) of pages of memory the OS has provided.

So, with that in mind, how do I deal with an ever expanding stack? (it does appear that my C theory class was mildly... flawed.)


Answers (7)

晨光如昨 2024-07-13 09:38:37

malloc is generally implemented in the C runtime in userspace, relying on specific OS system calls to map in pages of virtual memory. The job of malloc and free is to manage those pages of memory, which are fixed in size (typically 4 KB, but sometimes bigger), and to slice and dice them into pieces that applications can use.

See, for example, the GNU libc implementation.

For a much simpler implementation, check out the MIT operating systems class from last year. Specifically, see the final lab handout, and take a look at lib/malloc.c. This code uses JOS, the operating system developed in the class. The way it works is that it reads through the page tables (provided read-only by the OS), looking for unmapped virtual address ranges. It then uses the sys_page_alloc and sys_page_unmap system calls to map pages into, and unmap them from, the current process.
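For illustration, here is a minimal sketch of the "get whole pages from the OS, then slice them up" idea. It uses a POSIX-style anonymous mmap as the page source (standing in for whatever page-granting syscall your core ends up exposing, such as JOS's sys_page_alloc); tiny_alloc is a made-up name, it only bumps a pointer, and it has no free() -- it is not how GNU libc or JOS actually do it.

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096u

static uint8_t *cur  = NULL;   /* next free byte in the current run of pages */
static size_t   left = 0;      /* bytes remaining in that run */

void *tiny_alloc(size_t n)
{
    n = (n + 15u) & ~(size_t)15u;          /* keep 16-byte alignment */
    if (n > left) {
        /* Ask the OS for enough whole pages to satisfy the request. */
        size_t bytes = (n + PAGE_SIZE - 1) & ~(size_t)(PAGE_SIZE - 1);
        void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        cur  = p;
        left = bytes;
    }
    void *ret = cur;
    cur  += n;
    left -= n;
    return ret;    /* a real malloc would also record the block so free() can work */
}

The point is the division of labour: the OS hands out page-sized, page-aligned chunks, and everything finer-grained than that is bookkeeping inside the allocator.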

云胡 2024-07-13 09:38:37

There are multiple ways to tackle the problem.

Most often, C programs have their own malloc/free functionality. That one handles the small objects. Initially (and as soon as the memory is exhausted) the memory manager will ask the OS for more memory. The traditional ways to do this are mmap and sbrk on the Unix variants (GlobalAlloc / LocalAlloc on Win32).

I suggest that you take a look at the Doug Lea memory allocator (google: dlmalloc) from a memory provider's (e.g. the OS's) point of view. That allocator is top notch and has hooks for all major operating systems. If you want to know what a high-performance allocator expects from an OS, this code is your first choice.
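To make "asks the OS for more memory" concrete, here is a toy sketch of the sbrk() contract that a classic allocator like dlmalloc expects from its memory provider (dlmalloc calls this hook MORECORE). The "OS" below is faked with a static array, shrinking is not handled, and my_sbrk is an illustrative name -- in a real kernel you would grow the process's data segment by mapping pages instead.

#include <stddef.h>
#include <stdint.h>

#define HEAP_LIMIT (1u << 20)          /* pretend the core granted us 1 MiB */

static uint8_t heap_area[HEAP_LIMIT];
static size_t  brk_offset = 0;         /* current "program break" */

void *my_sbrk(ptrdiff_t increment)
{
    /* my_sbrk(0) just reports the current break; negative increments
       (shrinking) are not handled in this toy. */
    if (increment < 0 || (size_t)increment > HEAP_LIMIT - brk_offset)
        return (void *)-1;             /* sbrk signals failure this way */
    void *old_break = &heap_area[brk_offset];
    brk_offset += (size_t)increment;   /* everything between the old and the
                                          new break is now usable memory */
    return old_break;
}

The allocator then carves blocks out of the region between successive breaks; all the OS really has to guarantee is that the memory it hands back is mapped, writable, and contiguous with what it handed back before.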

叹梦 2024-07-13 09:38:37

Are you confusing the heap and the stack?

I ask because you mention "an ever expanding piece of memory", scope and pushing variables on the heap as they are declared. That sure sounds like you are actually talking about the stack.

In the most common C implementations, declarations of automatic variables like

int i;

are generally going to result in i being allocated on the stack. In general malloc won't get involved unless you explicitly invoke it, or some library call you make invokes it.
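A small illustration of that distinction (just a sketch, not from the book):

#include <stdlib.h>

void example(void)
{
    int i = 0;                    /* automatic: lives on the stack and vanishes
                                     when example() returns */
    int *p = malloc(sizeof *p);   /* dynamic: lives on the heap until free()d,
                                     and only because malloc was called explicitly */
    if (p) {
        *p = i;
        free(p);
    }
}                                 /* i's stack slot is reclaimed here automatically */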

I'd recommend looking at "Expert C Programming" by Peter Van Der Linden for background on how C programs typically work with the stack and the heap.

尬尬 2024-07-13 09:38:37

Compulsory reading: Knuth - Art of Computer Programming, Volume 1, Chapter 2, Section 2.5. Otherwise, you could read Kernighan & Ritchie "The C Programming Language" to see an implementation; or, you could read Plauger "The Standard C Library" to see another implementation.

I believe that what you need to do inside your core will be somewhat different from what the programs outside the core see. In particular, the in-core memory allocation for programs will be dealing with virtual memory, etc., whereas the programs outside the core simply see the results of what the core has provided.

已下线请稍等 2024-07-13 09:38:37

Read about virtual memory management (paging). It's highly CPU-specific, and every OS implements VM management specially for every supported CPU. If you're writing your OS for x86/amd64, read their respective manuals.
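As a taste of what "CPU-specific" means, here is a rough sketch of installing one mapping on 32-bit x86 (non-PAE: two-level page tables, 4 KB pages). The flag bits come from the Intel manuals; error handling, page-table allocation and TLB invalidation are left out, and map_page is just an illustrative name.

#include <stdint.h>

#define PTE_PRESENT 0x001u   /* bit 0: mapping is valid */
#define PTE_WRITE   0x002u   /* bit 1: writable */
#define PTE_USER    0x004u   /* bit 2: accessible from user mode */

typedef uint32_t pte_t;

/* Map one 4 KB page, virt -> phys, in the page directory 'pgdir'.
 * Assumes the page table for this range already exists and is
 * addressable, which a real kernel has to arrange for itself. */
void map_page(pte_t *pgdir, uint32_t virt, uint32_t phys, uint32_t flags)
{
    uint32_t pd_index = (virt >> 22) & 0x3FFu;            /* top 10 bits  */
    uint32_t pt_index = (virt >> 12) & 0x3FFu;            /* next 10 bits */

    pte_t *pagetable = (pte_t *)(pgdir[pd_index] & ~0xFFFu);
    pagetable[pt_index] = (phys & ~0xFFFu) | (flags & 0xFFFu) | PTE_PRESENT;
    /* After changing a live mapping, flush the stale TLB entry
       (e.g. with the invlpg instruction). */
}

On amd64 the same idea becomes four levels of tables; that is exactly the kind of difference the vendor manuals spell out.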

神经大条 2024-07-13 09:38:37

Generally, the C library handles the implementation of malloc, requesting memory from the OS (either via anonymous mmap or, in older systems, sbrk) as necessary. So your kernel side of things should handle allocating whole pages via something like one of those means.

Then it's up to malloc to dole out memory in a way that doesn't fragment the free memory too much. I'm not too au fait with the details of this, though; however, the term arena comes to mind. If I can hunt down a reference, I'll update this post.
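To give the arena idea some shape, here is a rough first-fit free-list sketch over a single mmap'd region -- no coalescing, no thread safety, no growing of the arena, and the names (arena_init, arena_alloc, arena_free) are made up for the example rather than taken from any real allocator.

#include <stddef.h>
#include <sys/mman.h>

typedef struct block {
    size_t        size;   /* usable bytes that follow this header */
    struct block *next;   /* next free block (only meaningful while free) */
} block_t;

static block_t *free_list = NULL;

/* One-time setup: grab an arena from the OS and make it a single free block. */
int arena_init(size_t arena_bytes)
{
    block_t *b = mmap(NULL, arena_bytes, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (b == MAP_FAILED)
        return -1;
    b->size = arena_bytes - sizeof(block_t);
    b->next = NULL;
    free_list = b;
    return 0;
}

void *arena_alloc(size_t n)
{
    n = (n + 15u) & ~(size_t)15u;                    /* 16-byte granularity */
    for (block_t **pp = &free_list; *pp; pp = &(*pp)->next) {
        block_t *b = *pp;
        if (b->size < n)
            continue;
        if (b->size >= n + sizeof(block_t) + 16) {   /* big enough to split */
            block_t *rest = (block_t *)((char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block_t);
            rest->next = b->next;
            b->size = n;
            *pp = rest;
        } else {
            *pp = b->next;                           /* hand out the whole block */
        }
        return b + 1;                                /* user memory starts after the header */
    }
    return NULL;        /* a real malloc would ask the OS for more pages here */
}

void arena_free(void *p)
{
    if (!p)
        return;
    block_t *b = (block_t *)p - 1;                   /* recover the header */
    b->next = free_list;                             /* no coalescing, so the arena
                                                        fragments over time */
    free_list = b;
}

A real allocator (dlmalloc included) adds boundary tags so neighbouring free blocks can be coalesced, plus size bins for fast lookup; that is where most of the complexity lives.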

晨光如昨 2024-07-13 09:38:37

Danger Danger!! If you're even considering attempting kernel development, you should be very aware of the cost of your resources and their relatively limited availability...

One thing about recursion is that it's very expensive (at least in kernel land); you're not going to see many functions written to simply continue unabated, or else your kernel will panic.

To underscore my point here (at stackoverflow.com, heh), check out this post from the NT Debugging blog about kernel stack overflows; specifically:

· On x86-based platforms, the kernel-mode stack is 12K.

· On x64-based platforms, the kernel-mode stack is 24K. (x64-based platforms include systems with processors using the AMD64 architecture and processors using the Intel EM64T architecture.)

· On Itanium-based platforms, the kernel-mode stack is 32K with a 32K backing store.

That's really not a whole lot.

The Usual Suspects


1. Using the stack liberally.

2. Calling functions recursively.

If you read over the blog a bit, you will see how hard kernel development can be, with a rather unique set of issues. Your theory class was not wrong; it was simply, simple. ;)

Going from theory to kernel development is about as significant a context switch as is possible (perhaps save some hypervisor interaction in the mix!!).

Anyhow, never assume; validate and test your expectations.
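A made-up sketch of what that second suspect looks like in practice (struct node, the 64-byte buffer and the frame-size guess are all just for illustration):

struct node { struct node *child; };

/* Each call needs a frame for the return address, saved registers and
 * 'buf' -- easily 100+ bytes.  At that rate a 12 KB kernel stack is gone
 * after roughly a hundred nested calls, and then you panic. */
int walk_recursive(struct node *n, int depth)
{
    char buf[64];                       /* locals land on the (tiny) kernel stack */
    buf[0] = (char)depth;               /* pretend we use the buffer */
    if (n->child)
        return walk_recursive(n->child, depth + 1);   /* stack grows every level */
    return depth;
}

/* Kernel-friendly version: constant stack use, because the "recursion"
 * state lives in the data structure instead of on the stack. */
int walk_iterative(struct node *n)
{
    int depth = 0;
    while (n->child) {
        n = n->child;
        depth++;
    }
    return depth;
}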
