What is the difference between kernel space and user space?
What is the difference between the kernel space and the user space? Do kernel space, kernel threads, kernel processes and kernel stack mean the same thing? Also, why do we need this differentiation?
The really simplified answer is that the kernel runs in kernel space, and normal programs run in user space. User space is basically a form of sand-boxing -- it restricts user programs so they can't mess with memory (and other resources) owned by other programs or by the OS kernel. This limits (but usually doesn't entirely eliminate) their ability to do bad things like crashing the machine.
The kernel is the core of the operating system. It normally has full access to all memory and machine hardware (and everything else on the machine). To keep the machine as stable as possible, you normally want only the most trusted, well-tested code to run in kernel mode/kernel space.
The stack is just another part of memory, so naturally it's segregated right along with the rest of memory.
The Random Access Memory (RAM) can be logically divided into two distinct regions, namely the kernel space and the user space. (The physical addresses of the RAM are not actually divided, only the virtual addresses; all of this is implemented by the MMU.)
The kernel runs in the part of memory entitled to it. This part of memory cannot be accessed directly by the processes of normal users, while the kernel can access all parts of the memory. To access some part of the kernel, user processes have to use the predefined system calls, e.g. open, read, write, etc. C library functions like printf in turn call the system call write.
The system calls act as an interface between the user processes and the kernel. The access rights are placed on the kernel space in order to stop users from messing with the kernel unknowingly.
So, when a system call occurs, a software interrupt is sent to the kernel. The CPU temporarily hands control over to the associated interrupt handler routine. The process that was halted by the interrupt resumes after the interrupt handler routine finishes its job.
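As a rough illustration of the library-versus-system-call layering described above, here is a minimal Python sketch. It assumes a POSIX system; the helper names write_via_syscall and roundtrip_through_kernel are made up for this example. Python's os.write and os.read are thin wrappers around write(2) and read(2), while print() plays the role of printf: it formats and buffers in user space before the data reaches write(2).

```python
import os


def write_via_syscall(fd: int, data: bytes) -> int:
    """Ask the kernel to write data to fd; return the number of bytes written."""
    # os.write is a thin wrapper around the write(2) system call.
    return os.write(fd, data)


def roundtrip_through_kernel(data: bytes) -> bytes:
    """Send a small message through a kernel-managed pipe and read it back."""
    read_fd, write_fd = os.pipe()  # pipe(2): the buffer lives in kernel memory
    try:
        written = write_via_syscall(write_fd, data)
        return os.read(read_fd, written)  # read(2): the kernel copies it back out
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    # Library layer: formatted and buffered in user space, then handed to write(2).
    print("hello via the library layer", flush=True)
    # System call layer: bytes go straight to file descriptor 1 (stdout).
    write_via_syscall(1, b"hello via write(2)\n")
    print(roundtrip_through_kernel(b"ping"))
```

In both cases the user process never touches the terminal or the pipe buffer itself; it only asks the kernel to do so on its behalf.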
CPU rings are the clearest distinction
In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:
- 0 for the kernel
- 3 for users
This is the most hard and fast definition of kernel vs userland.
Why Linux does not use rings 1 and 2: CPU Privilege Rings: Why rings 1 and 2 aren't used?
How is the current ring determined?
The current ring is selected by a combination of:
- the global descriptor table (GDT): an in-memory table of entries, where each entry has a field Privl which encodes the ring. The LGDT instruction sets the address of the current descriptor table. See also: http://wiki.osdev.org/Global_Descriptor_Table
- the segment registers CS, DS, etc., which point to the index of an entry in the GDT. For example, CS = 0 means the first entry of the GDT is currently active for the executing code.
What can each ring do?
The CPU chip is physically built so that:
- ring 0 can do anything
- ring 3 cannot run several instructions or write to several registers, most notably:
  - it cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless. In other words, it cannot modify the current segment descriptor, which determines the current ring.
  - it cannot modify the page tables: How does x86 paging work? In other words, it cannot modify the CR3 register, and paging itself prevents modification of the page tables. This prevents one process from seeing the memory of other processes, for security and ease of programming reasons.
  - it cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging. Handlers run in ring 0, so letting ring 3 register them would break the security model. In other words, it cannot use the LGDT and LIDT instructions.
  - it cannot execute IO instructions like in and out, and thus cannot make arbitrary hardware accesses. Otherwise, for example, file permissions would be useless if any program could directly read from disk. More precisely, thanks to Michael Petch: it is actually possible for the OS to allow IO instructions in ring 3; this is controlled by the Task State Segment. What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place. Linux always disallows it. See also: Why doesn't Linux use the hardware context switch via the TSS?
How do programs and operating systems transition between rings?
- When the CPU is turned on, it starts running the initial program in ring 0 (well, kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).
- When a userland process wants the kernel to do something for it, like write to a file, it uses an instruction that generates an interrupt, such as int 0x80 or syscall, to signal the kernel. x86-64 Linux syscall hello world example:

```
.data
hello_world:
    .ascii "hello world\n"
    hello_world_len = . - hello_world
.text
.global _start
_start:
    /* write */
    mov $1, %rax
    mov $1, %rdi
    mov $hello_world, %rsi
    mov $hello_world_len, %rdx
    syscall

    /* exit */
    mov $60, %rax
    mov $0, %rdi
    syscall
```

  When this happens, the CPU calls an interrupt callback handler which the kernel registered at boot time. This handler runs in ring 0; it decides if the kernel will allow this action, does the action, and restarts the userland program in ring 3.
- When the exec system call is used (or when the kernel starts /init), the kernel prepares the registers and memory of the new userland process, then jumps to the entry point and switches the CPU to ring 3.
- If the program tries to do something naughty like write to a forbidden register or memory address (because of paging), the CPU also calls some kernel callback handler in ring 0. But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.
- When the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically. This hardware clock generates interrupts that run in ring 0, and allow the kernel to schedule which userland processes to wake up. This way, scheduling can happen even if the processes are not making any system calls.
What is the point of having multiple rings?
Separating kernel and userland has two major advantages: programs are easier to write, because one process cannot easily interfere with another, and the system is more secure.
How to play around with it?
I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples
I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.
Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. read the control registers: How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault
A QEMU + Buildroot setup is a convenient way to try it out without killing your host.
The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system; that would be an interesting project actually.
Negative rings
While negative rings are not actually referenced in the Intel manual, there are CPU modes which have further capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.
One example is the hypervisor mode used in virtualization.
ARM
In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.
There exist 4 exception levels in ARMv8, commonly used as:
- EL0: userland
- EL1: kernel ("supervisor" in ARM terminology). Entered with the svc instruction (SuperVisor Call), previously known as swi before unified assembly, which is the instruction used to make Linux system calls. Hello world ARMv8 example, hello.S:

```
.text
.global _start
_start:
    /* write */
    mov x0, 1
    ldr x1, =msg
    ldr x2, =len
    mov x8, 64
    svc 0

    /* exit */
    mov x0, 0
    mov x8, 93
    svc 0
msg:
    .ascii "hello syscall v8\n"
len = . - msg
```

  Tested with QEMU on Ubuntu 16.04.
- EL2: hypervisors, for example Xen. Entered with the hvc instruction (HyperVisor Call). A hypervisor is to an OS what an OS is to userland. For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debugging, just like Linux does for userland programs. Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single piece of hardware, keeping hardware usage always close to 100% and saving a lot of money. AWS for example used Xen until 2017, when its move to KVM made the news.
- EL3: yet another level. TODO example. Entered with the smc instruction (Secure Monitor Call).
The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully.
The ARM situation changed a bit with the advent of ARMv8.1 Virtualization Host Extensions (VHE). This extension allows the kernel to run in EL2 efficiently.
VHE was created because in-Linux-kernel virtualization solutions such as KVM have gained ground over Xen (see e.g. AWS' move to KVM mentioned above), because most clients only need Linux VMs, and as you can imagine, being all in a single project, KVM is simpler and potentially more efficient than Xen. So now the host Linux kernel acts as the hypervisor in those cases.
Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 being the lowest and 3 the highest. Higher levels tend to be created more often than lower ones.
The current EL can be queried with the MRS instruction: what is the current execution mode/exception level, etc?
ARM does not require all exception levels to be present, which allows implementations that don't need the feature to save chip area; see the ARMv8 "Exception levels" documentation.
QEMU for example defaults to EL1, but EL2 and EL3 can be enabled with command line options: qemu-system-aarch64 entering el1 when emulating a53 power up
Code snippets tested on Ubuntu 18.10.
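The "naughty program" case described above can be observed from userland without any assembly. The following is a minimal Python sketch, assuming Linux and assuming that the last page of the virtual address space belongs to the kernel (true for the usual 32-bit and 64-bit layouts, but not guaranteed everywhere). The function names are made up for this example. A forked child tries to read that page, the CPU faults into the kernel, and the kernel kills the child with a signal that the parent can inspect.

```python
import ctypes
import os
import signal


def top_page_address() -> int:
    """Address of the last page of the address space (assumed to be kernel-only)."""
    pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8
    return (1 << pointer_bits) - 4096


def signal_from_forbidden_read(address: int) -> int:
    """Fork a child that reads one byte at address.

    Returns the number of the signal that killed the child, or 0 if it exited.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # The load faults, the CPU enters the kernel's fault handler in
            # privileged mode, and the kernel sends this process a signal.
            ctypes.string_at(address, 1)
        finally:
            os._exit(0)  # only reached if the read did not kill the child
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0


if __name__ == "__main__":
    sig = signal_from_forbidden_read(top_page_address())
    print("child killed by:", signal.Signals(sig).name if sig else "nothing")
```

The fork is there only so that the experiment kills a throwaway child rather than the interpreter running the example.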
Kernel space and user space are concepts of virtual memory. It doesn't mean that RAM (your actual memory) is divided into kernel and user space.
Each process is given virtual memory which is divided into kernel and user space.
So saying "The random access memory (RAM) can be divided into two distinct regions namely - the kernel space and the user space." is wrong.
Regarding the "kernel space vs user space" question:
When a process is created, its virtual memory is divided into user space and kernel space. The user space region contains the data, code, stack, and heap of the process, while kernel space contains things such as the page table for the process, kernel data structures, and kernel code.
To run kernel space code, control must shift to kernel mode (using the 0x80 software interrupt for system calls), and the kernel stack is basically shared among all processes currently executing in kernel space.
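To make the "each process is given its own virtual memory" point concrete, here is a small Python sketch, assuming a POSIX system with fork(); the function name is made up for this example. After a fork, parent and child see the same virtual address for a variable, yet a write in the child does not change the parent's value, because the two address spaces map that address to different physical memory.

```python
import ctypes
import os
import struct


def same_address_different_value():
    """Return (parent_addr, parent_value, child_addr, child_value) after a fork."""
    cell = ctypes.c_int(1)
    record = "QI"  # 64-bit address followed by a 32-bit value
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            cell.value = 2  # changes only the child's copy of the page
            os.write(write_fd, struct.pack(record, ctypes.addressof(cell), cell.value))
        finally:
            os._exit(0)
    os.close(write_fd)
    child_addr, child_value = struct.unpack(
        record, os.read(read_fd, struct.calcsize(record))
    )
    os.close(read_fd)
    os.waitpid(pid, 0)
    return ctypes.addressof(cell), cell.value, child_addr, child_value


if __name__ == "__main__":
    p_addr, p_val, c_addr, c_val = same_address_different_value()
    print(f"parent: {p_addr:#x} -> {p_val}")
    print(f"child:  {c_addr:#x} -> {c_val}")
```

The pipe is used only to report the child's view back to the parent; it is itself a kernel object reached through system calls.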
Kernel space and user space are the separation of the privileged operating system functions from the restricted user applications. The separation is necessary to prevent user applications from ransacking your computer. It would be a bad thing if any old user program could start writing random data to your hard drive or read memory from another user program's memory space.
User space programs cannot access system resources directly, so access is handled on the program's behalf by the operating system kernel. User space programs typically make such requests of the operating system through system calls.
Kernel threads, processes, and stacks do not mean the same thing. They are the kernel space analogues of their counterparts in user space.
On a 32-bit system, each process has its own 4 GB of virtual memory, which maps to physical memory through page tables. The virtual memory is mostly split into two parts: 3 GB for the use of the process and 1 GB for the use of the kernel. Most of the variables you create lie in the first part of the address space. That part is called user space. The last part is where the kernel resides, and it is common to all processes. This is called kernel space, and most of this space is mapped to the starting locations of physical memory, where the kernel image is loaded at boot time.
The maximum size of address space depends on the length of the address register on the CPU.
On systems with 32-bit address registers, the maximum size of address space is 2^32 bytes, or 4 GiB.
Similarly, on 64-bit systems, 2^64 bytes can be addressed.
Such address space is called virtual memory or virtual address space. It is not actually related to physical RAM size.
On Linux platforms, virtual address space is divided into kernel space and user space.
An architecture-specific constant called task size limit, or TASK_SIZE, marks the position where the split occurs:
- the address range from 0 up to TASK_SIZE-1 is allotted to user space;
- the remainder from TASK_SIZE up to 2^32-1 (or 2^64-1) is allotted to kernel space.
On a particular 32-bit system for example, 3 GiB could be occupied for user space and 1 GiB for kernel space.
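The arithmetic above can be checked with a few lines of Python. This is only a worked example: the value 0xC0000000 for the split is an assumption that corresponds to the 3 GiB / 1 GiB case on a 32-bit system, and the names address_space_size and classify are made up for illustration.

```python
GIB = 1 << 30  # bytes in one GiB


def address_space_size(address_bits: int) -> int:
    """Number of addressable bytes with the given address register width."""
    return 1 << address_bits


def classify(address: int, task_size: int, address_bits: int = 32) -> str:
    """Say whether a virtual address falls in user space or kernel space."""
    if not 0 <= address < address_space_size(address_bits):
        raise ValueError("address outside the virtual address space")
    return "user" if address < task_size else "kernel"


if __name__ == "__main__":
    task_size_example = 0xC0000000  # assumed 3 GiB / 1 GiB split
    total = address_space_size(32)
    print(total // GIB, "GiB total")
    print(task_size_example // GIB, "GiB user,",
          (total - task_size_example) // GIB, "GiB kernel")
    print(classify(0xBFFFFFFF, task_size_example))  # last user address
    print(classify(0xC0000000, task_size_example))  # first kernel address
```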
Each application/program in a Unix-like operating system is a process; each of those has a unique identifier called Process Identifier (or simply Process ID, i.e. PID). Linux provides two system calls for starting programs: fork(), which creates a new process, and exec(), which replaces the program running in a process.
A kernel thread is a lightweight process that executes in kernel space.
A single process may consist of several threads sharing the same data and resources but taking different paths through the program code. Linux provides a clone() system call to generate threads.
Example uses of kernel threads are: data synchronization of RAM, helping the scheduler to distribute processes among CPUs, etc.
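A minimal sketch of how fork() and exec() are typically combined, using Python's os module as a thin layer over the system calls. It assumes a POSIX system; run_in_child is a made-up helper name. fork() gives the child its own PID, and exec() then loads a different program into that child.

```python
import os
import sys


def run_in_child(argv):
    """fork() a child, exec argv in it, and return (child_pid, exit_code)."""
    pid = os.fork()  # new process: initially a copy of the caller
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child's program image
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return pid, exit_code


if __name__ == "__main__":
    child_pid, code = run_in_child(
        [sys.executable, "-c", "import os; print('child pid', os.getpid())"]
    )
    print("parent pid", os.getpid(), "child pid", child_pid, "exit code", code)
```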
Briefly: the kernel runs in kernel space, which has full access to all memory and resources. You can say that memory is divided into two parts: one part for the kernel, and one part for the user's own processes (user space), where normal programs run. User space cannot access kernel space directly, so it asks the kernel for resources through system calls (exposed through wrappers in glibc).
There is a statement that simplifies the difference: "User space is just a test load for the kernel."
To be very clear: the processor architecture allows the CPU to operate in two modes, kernel mode and user mode, and hardware instructions allow switching from one mode to the other.
Memory can be marked as being part of user space or kernel space.
When the CPU is running in user mode, it can access only memory that is in user space; if it attempts to access memory in kernel space, the result is a "hardware exception". When the CPU is running in kernel mode, it can directly access both kernel space and user space.
Kernel space and user space are logical spaces.
Most modern processors are designed to run in different privilege modes. x86 machines can run in 4 different privilege modes.
A particular machine instruction can be executed only when the CPU is at or above a particular privilege mode.
This design gives you system protection, or sandboxing of the execution environment.
The kernel is a piece of code that manages your hardware and provides system abstractions, so it needs access to all machine instructions. It is the most trusted piece of software, so it should be executed with the highest privilege. Ring level 0 is the most privileged mode, so ring level 0 is also called kernel mode.
User applications are pieces of software that may come from any third-party vendor, and you can't completely trust them. Someone with malicious intent could write code to crash your system if they had complete access to all machine instructions. So applications should be given access to only a limited set of instructions. Ring level 3 is the least privileged mode and all your applications run in that mode, so ring level 3 is also called user mode.
Note: I am not covering ring levels 1 and 2. They are basically modes with intermediate privilege, so device driver code might be executed with this privilege. AFAIK, Linux uses only ring levels 0 and 3, for kernel code execution and user applications respectively.
So any operation happening in kernel mode can be considered as kernel space, and any operation happening in user mode can be considered as user space.
Kernel space means a memory space that can only be touched by the kernel. On 32-bit Linux it is 1 GB (from 0xC0000000 to 0xffffffff as virtual memory addresses). Every process created by the kernel is also a kernel thread, so for one process there are two stacks: one stack in user space for the process, and another in kernel space for the kernel thread.
The kernel stack occupies 2 pages (8 KB on 32-bit Linux), and includes a task_struct (about 1 KB) and the real stack (about 7 KB). The latter is used to store automatic variables, function call parameters, and function addresses in kernel functions. The code is in Processor.h (linux\include\asm-i386), where __get_free_pages(GFP_KERNEL,1) means allocating 2^1 = 2 pages.
But the process stack is another thing: its address is just below 0xC0000000 (32-bit Linux), and its size can be much bigger; it is used for user space function calls.
So here comes a question about system calls: a system call runs in kernel space but is called by a process in user space, so how does it work? Will Linux put its parameters and function address on the kernel stack or on the process stack? Linux's solution: all system calls are triggered by the software interrupt INT 0x80, which is defined in entry.S (linux\arch\i386\kernel).
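The page arithmetic in this answer can be written out as a small worked example. The sketch below assumes 4 KiB pages (as on 32-bit x86) and takes the approximate 1 KB task_struct size from the text; the function names are made up for illustration. The second argument of __get_free_pages is an order, so the allocation covers 2^order contiguous pages.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, as on 32-bit x86


def pages_for_order(order: int) -> int:
    """An allocation of the given order covers 2**order contiguous pages."""
    return 1 << order


def kernel_stack_budget(order: int, task_struct_bytes: int = 1024):
    """Return (total_bytes, bytes_left_for_stack) for the per-process block."""
    total = pages_for_order(order) * PAGE_SIZE
    return total, total - task_struct_bytes


if __name__ == "__main__":
    total, stack = kernel_stack_budget(1)
    print(pages_for_order(1), "pages =", total, "bytes, of which about",
          stack, "bytes remain for the stack")
```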
By Sunil Yadav, on Quora:
In short, kernel space is the portion of memory where the Linux kernel runs (the top 1 GB of virtual address space on 32-bit Linux), and user space is the portion of memory where user applications run (the bottom 3 GB of virtual memory on 32-bit Linux). If you want to know more, see the link given below:
http://learnlinuxconcepts.blogspot.in/2014/02/kernel-space-and-user-space.html
Trying to give a very simplified explanation:
Virtual memory is divided into kernel space and user space.
Kernel space is the area of virtual memory where kernel processes run, and user space is the area of virtual memory where user processes run.
This division is required for memory access protection.
Whenever a bootloader starts a kernel after loading it to a location in RAM (on an ARM-based controller, typically), it needs to make sure that the controller is in supervisor mode with FIQs and IRQs disabled.
The correct answer is: There is no such thing as kernel space and user space. The processor instruction set has special permissions to set destructive things like the root of the page table map, or access hardware device memory, etc.
Kernel code has the highest level privileges, and user code the lowest. This prevents user code from crashing the system, modifying other programs, etc.
Generally kernel code is kept under a different memory map than user code (just as user spaces are kept in different memory maps than each other). This is where the "kernel space" and "user space" terms come from. But that is not a hard and fast rule. For example, since the x86 indirectly requires its interrupt/trap handlers to be mapped at all times, part (or in some OSes all) of the kernel must be mapped into user space. Again, this does not mean that such code has user privileges.
Why is the kernel/user divide necessary? Some designers disagree that it is, in fact, necessary. Microkernel architecture is based on the idea that the highest privileged sections of code should be as small as possible, with all significant operations done in user privileged code. You would need to study why this might be a good idea, it is not a simple concept (and is famous for both having advantages and drawbacks).
This demarcation needs architecture support: some instructions can only be executed in privileged mode.
The page tables hold access details. If a user process tries to access an address that lies in the kernel address range, it gets a privilege violation fault.
So to enter privileged mode, the process has to run an instruction such as a trap, which changes the CPU mode to privileged and gives access to those instructions as well as to those memory regions.
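The page-table check described above can be modelled in a few lines. The following Python sketch is a toy model only, not how any real MMU or kernel is implemented: the PageEntry type, the user_accessible flag, and the translate function are all made up to illustrate the idea that each mapping carries a permission bit which the hardware compares against the current CPU mode.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for the toy model


class PrivilegeFault(Exception):
    """Raised where real hardware would trap into the kernel's fault handler."""


@dataclass(frozen=True)
class PageEntry:
    frame: int             # physical frame number
    user_accessible: bool  # models a user/supervisor permission bit


def translate(page_table: dict, address: int, cpu_mode: str) -> int:
    """Translate a virtual address, enforcing the per-page permission bit."""
    entry = page_table.get(address // PAGE_SIZE)
    if entry is None:
        raise PrivilegeFault(f"no mapping for {address:#x}")
    if cpu_mode == "user" and not entry.user_accessible:
        raise PrivilegeFault(f"user-mode access to kernel page at {address:#x}")
    return entry.frame * PAGE_SIZE + address % PAGE_SIZE


if __name__ == "__main__":
    table = {
        0x00000: PageEntry(frame=5, user_accessible=True),   # a user page
        0xC0000: PageEntry(frame=1, user_accessible=False),  # a kernel page
    }
    print(hex(translate(table, 0x10, "user")))
    print(hex(translate(table, 0xC0000010, "kernel")))
    try:
        translate(table, 0xC0000010, "user")
    except PrivilegeFault as fault:
        print("fault:", fault)
```

In the model, switching cpu_mode from "user" to "kernel" stands in for the trap instruction: the same address becomes accessible only after the mode change.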
In Linux there are two spaces: the first is user space and the other is kernel space. User space consists only of the user applications you want to run. The kernel provides services such as process management, file management, signal handling, memory management, thread management, and many more. If you run an application from user space, that application interacts only with the kernel services, and those services interact with the device drivers, which sit between the hardware and the kernel.
The main benefit of separating kernel space and user space is security against viruses: all user applications live in user space, and the services live in kernel space. That is one reason Linux is less affected by viruses.