operating-system linux-kernel terminology kernel

What is the difference between kernel space and user space?

Posted on 2024-11-06 07:31:37

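What is the difference between kernel space and user space? Do kernel space, kernel threads, kernel processes and the kernel stack mean the same thing? Also, why do we need this differentiation?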


Comments (16)

深居我梦 2024-11-13 07:31:37

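The really simplified answer is that the kernel runs in kernel space, and normal programs run in user space. User space is basically a form of sandboxing: it restricts user programs so they can't interfere with memory (and other resources) owned by other programs or by the OS kernel. This limits (but usually doesn't entirely eliminate) their ability to do bad things like crashing the machine.

The kernel is the core of the operating system. It normally has full access to all memory and machine hardware (and everything else on the machine). To keep the machine as stable as possible, you normally want only the most trusted, well-tested code to run in kernel mode/kernel space.

The stack is just another part of memory, so naturally it is segregated right along with the rest of memory.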

深巷少女 2024-11-13 07:31:37

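Random Access Memory (RAM) can be logically divided into two distinct regions, namely kernel space and user space. (The physical addresses of the RAM are not actually divided; only the virtual addresses are, and all of this is implemented by the MMU.)

The kernel runs in the part of memory reserved for it. This part of memory cannot be accessed directly by the processes of normal users, while the kernel can access all parts of memory. To reach some part of the kernel, user processes have to use the predefined system calls, i.e. open, read, write, etc. C library functions like printf in turn call the system call write.

System calls act as an interface between user processes and the kernel. Access rights are placed on kernel space in order to stop users from interfering with the kernel unknowingly.

So, when a system call occurs, a software interrupt is sent to the kernel. The CPU temporarily hands control over to the associated interrupt handler routine. The process that was halted by the interrupt resumes after the interrupt handler routine finishes its job.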
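The library-to-system-call chain described in this answer can be sketched in Python. This is only an illustration, assuming a POSIX-like system: `os.write` is a thin wrapper around the `write` system call, while `print` plays the role of a buffered library function such as printf. The helper name `write_via_syscall` is made up for this example.

```python
import os

def write_via_syscall(fd: int, text: str) -> int:
    """Hand bytes straight to the kernel with write(2); returns bytes written."""
    return os.write(fd, text.encode())

if __name__ == "__main__":
    # Library route: print() buffers in user space, then ends up calling write.
    print("hello via the library", flush=True)
    # Direct route: one write system call on file descriptor 1 (stdout).
    write_via_syscall(1, "hello via write\n")
```

Either way, the user process never touches the terminal device itself; it only asks the kernel to do so on its behalf.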


初心未许 2024-11-13 07:31:37

CPU rings are the clearest distinction

In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:

0 for kernel
3 for users

This is the most hard and fast definition of kernel vs userland.

Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?

How is the current ring determined?

The current ring is selected by a combination of:

global descriptor table: an in-memory table of GDT entries, where each entry has a field Privl that encodes the ring.
The LGDT instruction sets the address of the current descriptor table.
See also: http://wiki.osdev.org/Global_Descriptor_Table
the segment registers CS, DS, etc., which hold selectors that point to the index of an entry in the GDT.
For example, a CS selector with index 1 means the second entry of the GDT is currently active for the executing code (entry 0 is the reserved null descriptor).
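To make the selector layout concrete: a segment selector packs three fields into 16 bits, namely the table index in the upper 13 bits, one table-indicator bit, and a 2-bit privilege level. For CS, those low two bits are the current ring. The sketch below only decodes numbers; the function and type names are made up for illustration, and the sample values 0x10 and 0x33 are assumed to be the kernel and user code selectors commonly seen on x86-64 Linux.

```python
from typing import NamedTuple

class Selector(NamedTuple):
    index: int   # which descriptor table entry
    table: str   # "GDT" or "LDT"
    rpl: int     # privilege level (ring), 0..3

def decode_selector(value: int) -> Selector:
    """Split a 16-bit x86 segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return Selector(index=value >> 3,
                    table="LDT" if value & 0b100 else "GDT",
                    rpl=value & 0b11)

if __name__ == "__main__":
    # Assumed example inputs: typical kernel and user code selectors.
    for cs in (0x10, 0x33):
        print(hex(cs), decode_selector(cs))
```

Decoding a CS value this way shows directly whether the code was running in ring 0 or ring 3.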

What can each ring do?

The CPU chip is physically built so that:

ring 0 can do anything
ring 3 cannot run several instructions and write to several registers, most notably:
- cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.
  In other words, cannot modify the current segment descriptor, which determines the current ring.
- cannot modify the page tables: How does x86 paging work?
  In other words, cannot modify the CR3 register, and paging itself prevents modification of the page tables.
  This prevents one process from seeing the memory of other processes for security / ease of programming reasons.
- cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.
  Handlers run in ring 0, and would break the security model.
  In other words, cannot use the LGDT and LIDT instructions.
- cannot do IO instructions like in and out, and thus cannot make arbitrary hardware accesses.
  Otherwise, for example, file permissions would be useless if any program could directly read from disk.
  More precisely, thanks to Michael Petch: it is actually possible for the OS to allow IO instructions in ring 3; this is controlled by the Task State Segment.
  What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.
  Linux disallows it by default. See also: Why doesn't Linux use the hardware context switch via the TSS?

How do programs and operating systems transition between rings?

When the CPU is turned on, it starts running the initial program in ring 0 (well, kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).
When a userland process wants the kernel to do something for it, like write to a file, it uses an instruction such as int 0x80 (which generates a software interrupt) or syscall to signal the kernel. x86-64 Linux syscall hello world example:
```
.data
hello_world:
    .ascii "hello world\n"
    hello_world_len = . - hello_world
.text
.global _start
_start:
    /* write */
    mov $1, %rax
    mov $1, %rdi
    mov $hello_world, %rsi
    mov $hello_world_len, %rdx
    syscall

    /* exit */
    mov $60, %rax
    mov $0, %rdi
    syscall
```
Compile and run:
```
as -o hello_world.o hello_world.S
ld -o hello_world.out hello_world.o
./hello_world.out
```
GitHub upstream.
When this happens, the CPU calls an interrupt callback handler which the kernel registered at boot time. Here is a concrete baremetal example that registers a handler and uses it.
This handler runs in ring 0; it decides if the kernel will allow this action, does the action, and restarts the userland program in ring 3.
When the exec system call is used (or when the kernel starts /init), the kernel prepares the registers and memory of the new userland process, then it jumps to the entry point and switches the CPU to ring 3.
If the program tries to do something naughty, like write to a forbidden register or memory address (forbidden by paging), the CPU also calls some kernel callback handler in ring 0.
But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
When the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically.
This hardware clock generates interrupts that run in ring 0 and allow the kernel to schedule which userland processes to wake up.
This way, scheduling can happen even if the processes are not making any system calls.
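The "naughty program" case can be observed from userland without any kernel code. The sketch below is an illustration only, assuming a POSIX system where reading address 0 faults: the parent starts a child Python process that dereferences a null pointer through `ctypes.string_at(0)`, the CPU traps into the kernel, and the kernel ends the child with a signal. The names `NAUGHTY` and `run_naughty_child` are made up for this example.

```python
import signal
import subprocess
import sys

# Child program: read from address 0, which is not mapped in a normal process.
NAUGHTY = "import ctypes; ctypes.string_at(0)"

def run_naughty_child() -> int:
    """Run the child and return its exit status as seen by the parent."""
    result = subprocess.run([sys.executable, "-c", NAUGHTY],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode

if __name__ == "__main__":
    code = run_naughty_child()
    if code < 0:
        # subprocess reports death-by-signal as a negative signal number.
        print("child killed by", signal.Signals(-code).name)
    else:
        print("child exited with status", code)
```

The parent keeps running: the fault is contained in the offending process, which is the isolation the rings and paging are there to provide.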

What is the point of having multiple rings?

There are two major advantages of separating kernel and userland:

it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.

How to play around with it?

I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples

I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.

Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault

Here is a convenient QEMU + Buildroot setup to try it out without killing your host.

The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system, that would be an interesting project actually.

Negative rings

While negative rings are not actually referenced in the Intel manual, there are actually CPU modes which have further capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.

One example is the hypervisor mode used in virtualization.

For further details see:

ARM

In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.

There exist 4 exception levels in ARMv8, commonly used as:

EL0: userland

EL1: kernel ("supervisor" in ARM terminology).

Entered with the svc instruction (SuperVisor Call), previously known as swi before unified assembly, which is the instruction used to make Linux system calls. Hello world ARMv8 example:

hello.S

```
.text
.global _start
_start:
    /* write */
    mov x0, 1
    ldr x1, =msg
    ldr x2, =len
    mov x8, 64
    svc 0

    /* exit */
    mov x0, 0
    mov x8, 93
    svc 0
msg:
    .ascii "hello syscall v8\n"
len = . - msg
```

GitHub upstream.

Test it out with QEMU on Ubuntu 16.04 (the x registers are AArch64, so the 64-bit toolchain and emulator are needed):

```
sudo apt-get install qemu-user gcc-aarch64-linux-gnu
aarch64-linux-gnu-as -o hello.o hello.S
aarch64-linux-gnu-ld -o hello hello.o
qemu-aarch64 hello
```

Here is a concrete baremetal example that registers an SVC handler and does an SVC call.

EL2: hypervisors, for example Xen.
Entered with the hvc instruction (HyperVisor Call).
A hypervisor is to an OS, what an OS is to userland.
For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debug, just like Linux does for userland programs.
Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single hardware, keeping hardware usage always close to 100% and saving a lot of money.
AWS for example used Xen until 2017 when its move to KVM made the news.
EL3: yet another level. TODO example.
Entered with the smc instruction (Secure Monitor Call).

The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully:

The ARM situation changed a bit with the advent of ARMv8.1 Virtualization Host Extensions (VHE). This extension allows the kernel to run in EL2 efficiently:

VHE was created because in-Linux-kernel virtualization solutions such as KVM have gained ground over Xen (see e.g. AWS' move to KVM mentioned above), because most clients only need Linux VMs, and as you can imagine, being all in a single project, KVM is simpler and potentially more efficient than Xen. So now the host Linux kernel acts as the hypervisor in those cases.

Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 is the lowest and 3 the highest. Higher levels tend to be created more often than lower ones.

The current EL can be queried with the MRS instruction: what is the current execution mode/exception level, etc?
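Reading the register needs assembly (`mrs x0, CurrentEL`), but decoding the value is simple arithmetic: the exception level sits in bits [3:2] of CurrentEL and the other bits are reserved. The sketch below only decodes sample numbers; the function name and the raw values are hypothetical.

```python
def exception_level(current_el: int) -> int:
    """Extract the exception level from a raw CurrentEL register value.

    The level sits in bits [3:2]; the remaining bits are reserved.
    """
    return (current_el >> 2) & 0b11

if __name__ == "__main__":
    # Hypothetical raw values, as they might be read with `mrs x0, CurrentEL`.
    for raw in (0x0, 0x4, 0x8, 0xC):
        print(f"CurrentEL={raw:#x} -> EL{exception_level(raw)}")
```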

ARM does not require all exception levels to be present, so that implementations that don't need a feature can save chip area. ARMv8 "Exception levels" says:

An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1.
EL2 and EL3 are optional.

QEMU for example defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up

Code snippets tested on Ubuntu 18.10.


静待花开 2024-11-13 07:31:37

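Kernel space and user space are concepts of virtual memory. That does not mean RAM (your actual memory) is divided into kernel and user space.
Each process is given a virtual memory that is divided into kernel space and user space.

So saying
"Random access memory (RAM) can be divided into two distinct regions, namely kernel space and user space." is wrong.

Regarding the "kernel space vs user space" question:

When a process is created, its virtual memory is divided into user space and kernel space. The user space region contains the data, code, stack and heap of the process. Kernel space contains things such as the page table for the process, kernel data structures, kernel code, etc.
To run kernel space code, control must shift to kernel mode (using the 0x80 software interrupt for system calls), and the kernel stack is basically shared among all processes currently executing in kernel space.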

捂风挽笑 2024-11-13 07:31:37

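Kernel space and user space are the separation of the privileged operating system functions from the restricted user applications. The separation is necessary to prevent user applications from ransacking your computer. It would be a bad thing if any old user program could start writing random data to your hard drive or read memory from another user program's memory space.

User space programs cannot access system resources directly, so access is handled on the program's behalf by the operating system kernel. User space programs typically make such requests of the operating system through system calls.

Kernel threads, processes and stacks do not mean the same thing. They are the kernel space analogues of their counterparts in user space.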

无远思近则忧 2024-11-13 07:31:37

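On a 32-bit system, each process has its own 4 GB of virtual memory, which maps to physical memory through page tables. The virtual memory is mostly split into two parts: 3 GB for the use of the process and 1 GB for the use of the kernel. Most of the variables you create lie in the first part of the address space. That part is called user space. The last part is where the kernel resides and is common to all processes. This is called kernel space, and most of it is mapped to the starting locations of physical memory, where the kernel image is loaded at boot time.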

む无字情书 2024-11-13 07:31:37

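The maximum size of the address space depends on the length of the address register on the CPU.

On systems with 32-bit address registers, the maximum size of the address space is 2³² bytes, or 4 GiB.
Similarly, on 64-bit systems, 2⁶⁴ bytes can be addressed.

Such an address space is called virtual memory or virtual address space. It is not actually related to the physical RAM size.

On Linux platforms, the virtual address space is divided into kernel space and user space.

An architecture-specific constant called the task size limit, or TASK_SIZE, marks the position where the split occurs:

the address range from 0 up to TASK_SIZE-1 is allotted to user space;
the remainder, from TASK_SIZE up to 2³²-1 (or 2⁶⁴-1), is allotted to kernel space.

On a particular 32-bit system, for example, user space could occupy 3 GiB and kernel space 1 GiB.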
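The split can be worked through with plain arithmetic. This is a minimal sketch assuming the 32-bit, 3 GiB / 1 GiB example from the text, with TASK_SIZE taken to be 0xC0000000; the function name `classify` and the sample addresses are made up for illustration.

```python
GIB = 1 << 30

# Assumed example: a 32-bit system with a 3 GiB / 1 GiB split.
ADDRESS_BITS = 32
TASK_SIZE = 0xC0000000

def classify(address: int) -> str:
    """Return 'user' or 'kernel' for a virtual address under the split above."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in the address space")
    return "user" if address < TASK_SIZE else "kernel"

if __name__ == "__main__":
    total = 1 << ADDRESS_BITS
    print("address space:", total // GIB, "GiB")
    print("user space:   ", TASK_SIZE // GIB, "GiB")
    print("kernel space: ", (total - TASK_SIZE) // GIB, "GiB")
    for addr in (0x08048000, 0xBFFFFFFF, 0xC0000000):
        print(f"{addr:#010x} -> {classify(addr)}")
```

The boundary case is the useful one to check: TASK_SIZE-1 is the last user address and TASK_SIZE itself is the first kernel address.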

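Each application or program in a Unix-like operating system is a process; each of those has a unique identifier called the Process Identifier (or simply Process ID, i.e. PID). Linux provides two mechanisms for starting a program: 1. the fork() system call, which creates a new process, and 2. the exec() family of calls, which replaces the program running in an existing process.

A kernel thread is a lightweight process and also a program under execution.
A single process may consist of several threads that share the same data and resources but take different paths through the program code. Linux provides the clone() system call to generate threads.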
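Process creation with fork() can be tried from Python, whose `os.fork` wraps the system call. The following is a small sketch assuming a POSIX system; the helper name `fork_and_wait` and the exit status 7 are arbitrary choices for the example.

```python
import os

def fork_and_wait(child_status: int) -> int:
    """Fork a child that exits immediately; return the status the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: a copy of the parent with its own PID. Leave at once,
        # without running the parent's cleanup handlers.
        os._exit(child_status)
    # Parent: fork() returned the child's PID; wait for it to finish.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print("parent pid:", os.getpid())
    print("child exit status:", fork_and_wait(7))
```

fork() returns twice, once in each process, and the return value is how the two copies tell themselves apart.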

Example uses of kernel threads are: data synchronization of RAM, helping the scheduler distribute processes among CPUs, etc.


燕归巢 2024-11-13 07:31:37

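In short: the kernel runs in kernel space, which has full access to all memory and resources. You can say memory is divided into two parts, one for the kernel and one for the user's own processes. User space runs normal programs and cannot access kernel space directly, so it asks the kernel for resources through system calls (the predefined system call wrappers in glibc).

There is a statement that sums up the difference: "User space is just a test load for the kernel"...

To be very clear: the processor architecture allows the CPU to operate in two modes, kernel mode and user mode, and a hardware instruction allows switching from one mode to the other.

Memory can be marked as being part of user space or kernel space.

When the CPU is running in user mode, it can access only memory in user space; an attempt to access memory in kernel space results in a "hardware exception". When the CPU is running in kernel mode, it can directly access both kernel space and user space...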

话少情深 2024-11-13 07:31:37

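Kernel space and user space are logical spaces.

Most modern processors are designed to run in different privileged modes. x86 machines can run in 4 different privileged modes.

A particular machine instruction can be executed only when in or above a particular privileged mode.

Because of this design, you can give an execution environment system protection or sandbox it.

The kernel is a piece of code that manages your hardware and provides system abstractions. So it needs access to all machine instructions, and it is the most trusted piece of software. So it should be executed with the highest privilege. Ring level 0 is the most privileged mode, so ring level 0 is also called kernel mode.

User applications are pieces of software that come from any third-party vendor, and you can't trust them completely. Someone with malicious intent could write code to crash your system if he had complete access to all machine instructions. So applications should be given access to a limited set of instructions. Ring level 3 is the least privileged mode, so all your applications run in that mode. Hence ring level 3 is also called user mode.

Note: I am not covering ring levels 1 and 2. They are basically modes with intermediate privilege, so device driver code might be executed with this privilege. AFAIK, Linux uses only ring levels 0 and 3, for kernel code execution and user applications respectively.

So any operation happening in kernel mode can be considered kernel space,
and any operation happening in user mode can be considered user space.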

等风来 2024-11-13 07:31:37

Kernel space is memory that can be touched only by the kernel. On 32-bit Linux it is 1 GB (virtual addresses 0xC0000000 to 0xffffffff). Every process created by the kernel is also a kernel thread, so each process has two stacks: one in user space for the process and another in kernel space for the kernel thread.

The kernel stack occupies 2 pages (8 KB on 32-bit Linux) and includes a task_struct (about 1 KB) and the real stack (about 7 KB). The latter stores automatic variables, function call parameters and function addresses for kernel functions. Here is the code (Processor.h (linux\include\asm-i386)):

```
#define THREAD_SIZE (2*PAGE_SIZE)
#define alloc_task_struct() ((struct task_struct *) __get_free_pages(GFP_KERNEL,1))
#define free_task_struct(p) free_pages((unsigned long) (p), 1)
```

__get_free_pages(GFP_KERNEL,1) allocates 2^1 = 2 pages.
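The size arithmetic behind that allocation can be checked in a few lines. This is a sketch under one stated assumption, a 4 KiB page size as on 32-bit x86; the helper names are made up, and the 1 KiB task_struct figure is simply the rough number quoted in the text.

```python
PAGE_SIZE = 4096  # assumed: 4 KiB pages, as on 32-bit x86

def pages_for_order(order: int) -> int:
    """An order-n allocation covers 2**n contiguous pages."""
    return 1 << order

def bytes_for_order(order: int, page_size: int = PAGE_SIZE) -> int:
    """Total bytes covered by an order-n allocation."""
    return pages_for_order(order) * page_size

if __name__ == "__main__":
    thread_size = bytes_for_order(1)
    print("THREAD_SIZE =", thread_size, "bytes")
    # Figure from the text: roughly 1 KiB for task_struct, the rest is stack.
    print("approx. stack bytes:", thread_size - 1024)
```

The second argument of __get_free_pages is therefore an order (a power of two), not a page count, which is why 1 yields two pages.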

The process stack is another thing: it sits just below 0xC0000000 (32-bit Linux), can be considerably larger, and is used for user space function calls.

This raises a question about system calls: a system call runs in kernel space but is invoked by a process in user space, so how does it work? Does Linux put its parameters and function address on the kernel stack or on the process stack? Linux's solution: all system calls are triggered by the software interrupt INT 0x80.
The table is defined in entry.S (linux\arch\i386\kernel); here are some lines as an example:

```
ENTRY(sys_call_table)
.long SYMBOL_NAME(sys_ni_syscall)   /* 0  -  old "setup()" system call*/
.long SYMBOL_NAME(sys_exit)
.long SYMBOL_NAME(sys_fork)
.long SYMBOL_NAME(sys_read)
.long SYMBOL_NAME(sys_write)
.long SYMBOL_NAME(sys_open)     /* 5 */
.long SYMBOL_NAME(sys_close)
```
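The table is indexed by the system call number that the user process supplies when it triggers INT 0x80. As a toy model only (not kernel code), the lookup can be sketched like this; the list mirrors the excerpt above, while the `lookup` helper and its fallback for unknown numbers are assumptions made for the example.

```python
# Index = system call number, value = handler name, following the excerpt above.
SYS_CALL_TABLE = [
    "sys_ni_syscall",  # 0
    "sys_exit",        # 1
    "sys_fork",        # 2
    "sys_read",        # 3
    "sys_write",       # 4
    "sys_open",        # 5
    "sys_close",       # 6
]

def lookup(number: int) -> str:
    """Pick the handler name for a system call number."""
    if 0 <= number < len(SYS_CALL_TABLE):
        return SYS_CALL_TABLE[number]
    # Assumed fallback for this toy: unknown numbers get the
    # "not implemented" stub.
    return "sys_ni_syscall"

if __name__ == "__main__":
    for nr in (1, 4, 999):
        print(nr, "->", lookup(nr))
```

User code only ever passes a number and arguments; which kernel function actually runs is decided entirely inside ring 0.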


孤凫 2024-11-13 07:31:37

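By:

The Linux kernel refers to everything that runs in kernel mode and is made up of several distinct layers. At the lowest layer, the kernel interacts with the hardware via the HAL. At the middle level, the UNIX kernel is divided into 4 distinct areas. The first of the four areas handles character devices, raw and cooked TTY, and terminal handling. The second area handles network device drivers, routing protocols and sockets. The third area handles disk device drivers, page and buffer caches, the file system, virtual memory, file naming and mapping. The fourth and last area handles process dispatching, scheduling, creation and termination, as well as signal handling. Above all this we have the top layer of the kernel, which includes system calls, interrupts and traps. This level serves as the interface to each of the lower-level functions. A programmer uses the various system calls and interrupts to interact with the features of the operating system.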

彻夜缠绵 2024-11-13 07:31:37

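In short, kernel space is the portion of memory where the Linux kernel runs (the top 1 GB of virtual space in the case of Linux) and user space is the portion of memory where user applications run (the bottom 3 GB of virtual memory in the case of Linux). If you want to know more, see the link given below :)

http://learnlinuxconcepts.blogspot.in/2014/02/kernel-space-and-user-space.html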

巴黎盛开的樱花 2024-11-13 07:31:37

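Trying to give a very simplified explanation:

Virtual memory is divided into kernel space and user space.
Kernel space is the area of virtual memory where kernel processes run, and user space is the area of virtual memory where user processes run.

This division is required for memory access protection.

Whenever a bootloader starts a kernel after loading it to a location in RAM (typically on an ARM-based controller), it needs to make sure that the controller is in supervisor mode with FIQs and IRQs disabled.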

千柳 2024-11-13 07:31:37

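The correct answer is: there is no such thing as kernel space and user space. The processor instruction set has special permissions for setting destructive things like the root of the page table map, or accessing hardware device memory, etc.

Kernel code has the highest level of privilege, and user code the lowest. This prevents user code from crashing the system, modifying other programs, etc.

Generally, kernel code is kept under a different memory map than user code (just as user spaces are kept in different memory maps from each other). This is where the "kernel space" and "user space" terms come from. But that is not a hard and fast rule. For example, since x86 indirectly requires its interrupt/trap handlers to be mapped at all times, part (or on some OSes all) of the kernel must be mapped into user space. Again, this does not mean that such code has user privileges.

Why is the kernel/user divide necessary? Some designers disagree that it is, in fact, necessary. Microkernel architecture is based on the idea that the highest-privileged sections of code should be as small as possible, with all significant operations done in user-privileged code. You would need to study why this might be a good idea; it is not a simple concept (and is famous for having both advantages and drawbacks).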

-柠檬树下少年和吉他 2024-11-13 07:31:37

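This demarcation needs architecture support; some instructions can be executed only in privileged mode.

The page tables hold the access details: if a user process tries to access an address that lies in the kernel address range, it gets a privilege violation fault.

So, to enter privileged mode, a process has to run an instruction such as trap, which changes the CPU mode to privileged and grants access to the privileged instructions as well as the memory regions.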
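The page-table check can be modelled in a few lines. The sketch below uses the low three flag bits of an x86 page table entry (present, writable, user-accessible) and is a deliberate simplification: real hardware has further rules (for example the CR0.WP setting and no-execute bits) that are ignored here, and the function name is made up for illustration.

```python
PTE_PRESENT = 1 << 0  # page is mapped
PTE_WRITE = 1 << 1    # writes allowed
PTE_USER = 1 << 2     # accessible from user mode

def access_allowed(pte: int, user_mode: bool, write: bool) -> bool:
    """Return True if the access passes the simplified checks modelled here."""
    if not pte & PTE_PRESENT:
        return False
    if user_mode and not pte & PTE_USER:
        return False  # user code touching a supervisor-only page
    if write and not pte & PTE_WRITE:
        return False
    return True

if __name__ == "__main__":
    kernel_page = PTE_PRESENT | PTE_WRITE
    user_page = PTE_PRESENT | PTE_WRITE | PTE_USER
    print("user reads kernel page:  ", access_allowed(kernel_page, True, False))
    print("kernel reads kernel page:", access_allowed(kernel_page, False, False))
    print("user writes user page:   ", access_allowed(user_page, True, True))
```

When the check fails on real hardware, the CPU raises a page fault and the kernel decides what to do with the offending process.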


岁月流歌 2024-11-13 07:31:37

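In Linux there are two spaces: the first is user space and the other is kernel space. User space consists only of the user applications that you want to run. The kernel provides services such as process management, file management, signal handling, memory management, thread management, and many more. If you run an application from user space, that application interacts only with the kernel services, and those services interact with the device drivers that sit between the hardware and the kernel.
The main benefit of separating kernel space and user space is security against viruses: all user applications live in user space, while the services live in kernel space. That is one reason why Linux is less affected by viruses.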
