Why do registers on embedded systems need read-modify-write?
I was reading http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/, which said:
With read/write bits, firmware sets and clears bits when needed. It typically first reads the register, modifies the desired bit, then writes the modified value back out
and I have run into that construct while maintaining some production code written by the old-salt embedded guys here. I don't understand why this is necessary.
When I want to set/clear a bit, I always just OR/AND-NOT with a bitmask. To my mind, this solves any thread-safety problems, since I assume setting a register (either by assignment or by ORing with a mask) only takes one cycle. On the other hand, if you first read the register, then modify, then write, an interrupt happening between the read and the write may result in writing an old value to the register.
So why read-modify-write? Is it still necessary?
This depends somewhat on the architecture of your particular embedded device. I'll give three examples that cover the common cases. The basic gist of it, however, is that fundamentally the CPU core cannot operate directly on the I/O devices' registers, except to read and write them in a byte- or even word-wise fashion.
1) 68HC08 series, an 8-bit self-contained microcontroller.
This includes a "bit set" and a "bit clear" instruction. These, if you read the manual carefully, actually internally perform a read-modify-write cycle by themselves. They do have the advantage of being atomic operations, since as single instructions they cannot be interrupted.
You will also notice that they take longer than individual read or write instructions, but less time than using three instructions for the job (see below).
2) ARM or PowerPC, conventional 32-bit RISC CPUs (often found in high-end microcontrollers too).
These do not include any instructions which can both access memory and perform a computation (the and/or) at once. If you write in C:
*register |= 0x40;
it turns into a three-instruction sequence in assembly: a load, an OR, and a store (for this PowerPC example, r8 contains the register address).
Because this is multiple instructions, it is NOT atomic, and it can be interrupted. Making it atomic or even SMP-safe is beyond the scope of this answer - there are special instructions and techniques for it.
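To make the hazard concrete, here is a minimal sketch in C that writes the `|=` out as its separate load, modify and store steps and simulates an interrupt landing between them. The names `fake_reg`, `fake_isr` and `set_bit_interrupted` are made up for illustration, and an ordinary variable stands in for the hardware register; a real interrupt would of course arrive asynchronously rather than as a function call.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped register (assumption: on real hardware this
   would be a fixed address taken from the device datasheet). */
static volatile uint32_t fake_reg;

/* Hypothetical interrupt handler that sets another bit in the same register. */
static void fake_isr(void)
{
    fake_reg |= 0x80u;
}

/* `fake_reg |= 0x40` spelled out as the three steps a load/store CPU
   performs, with the simulated interrupt firing after the load. */
static uint32_t set_bit_interrupted(void)
{
    fake_reg = 0x01u;          /* known starting value for the demo */

    uint32_t tmp = fake_reg;   /* 1. load                                   */
    fake_isr();                /*    interrupt arrives here, sets 0x80      */
    tmp |= 0x40u;              /* 2. modify the now-stale copy              */
    fake_reg = tmp;            /* 3. store: the handler's 0x80 is lost      */

    return fake_reg;
}
```

With this ordering the final value has bits 0x01 and 0x40 set, but the handler's 0x80 has been overwritten, which is exactly the lost update the question worries about.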
3) IA32 (x86) and AMD64. Why you would use these for "embedded" is beyond me, but they are a half-way house between the other two examples.
x86 does have single-instruction in-memory forms: OR and AND accept a memory destination, and BTS/BTR set or clear a single bit in memory. Without them you would be back to the RISC case above, except that it takes two instructions instead of three, because x86 can load and modify in one instruction.
Even so, these instructions still have to load and store the register internally as well as modify it. Modern implementations explicitly break the instruction into the three RISC-like operations internally.
The oddity is that x86 (unlike the HC08) can be interrupted on the memory bus in mid-transaction by a bus master, not just by a conventional CPU interrupt. So you can manually add a LOCK prefix to an instruction that needs to do multiple memory cycles to complete, as in this case. You won't get this from plain C though.
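If a C11 compiler is available, `<stdatomic.h>` is one portable way to ask for an indivisible read-modify-write without writing assembly. The sketch below is an illustration only: `shared_flags` and the two helper names are invented, and it operates on an ordinary variable in RAM, not on a device register (whether atomics are appropriate for a given peripheral depends on the hardware). On x86, compilers typically implement these calls with a LOCK-prefixed instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag word shared between threads or CPUs. */
static _Atomic uint32_t shared_flags = 0x01u;

/* Atomically ORs `mask` into the flags; returns the value held before. */
static uint32_t set_flags_atomically(uint32_t mask)
{
    return atomic_fetch_or(&shared_flags, mask);
}

/* Atomically clears the bits in `mask`; returns the value held before. */
static uint32_t clear_flags_atomically(uint32_t mask)
{
    return atomic_fetch_and(&shared_flags, ~mask);
}
```

Both functions still perform a read, a modify and a write; the difference is that the hardware guarantees nothing else can touch the word in between.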
The thing is, if you don't want to modify the other bits in the register, you have to know what they are before you write something to it. Hence the read/modify/write. Note that a C statement that ORs a mask into a value with the |= operator might look like a simple write operation at first glance, but the compiler must perform a read first in order to preserve the other bits in the value (this is generally true, even if you're not talking about hardware registers, unless the compiler is able to use other knowledge about the value to optimize the read away).
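As a small illustration of why the read is unavoidable, here are two helpers of the kind often found in register-access code. The names `reg_set_bits`, `reg_clear_bits` and `fake_ctrl` are hypothetical, and a plain variable stands in for the hardware register.

```c
#include <stdint.h>

/* Stand-in for a control register (assumption: real code would use the
   address given in the device datasheet). */
static volatile uint32_t fake_ctrl = 0xA5u;

/* Sets the bits in `mask`. The compound assignment reads *reg, ORs in the
   mask, and writes the result back, so the other bits keep their values. */
static inline void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Clears the bits in `mask`, again as a read, an AND, and a write. */
static inline void reg_clear_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}
```

Calling `reg_set_bits(&fake_ctrl, 0x02u)` leaves every bit that was already set untouched, which is only possible because the old value was read first.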
Note that memory-mapped hardware registers are generally marked volatile specifically so that these optimizations cannot take place (otherwise many hardware register routines wouldn't work properly).
Finally, sometimes there is hardware support for setting or clearing bits without requiring a read/modify/write sequence. Some Atmel ARM microcontrollers I've worked with have dedicated registers for this: writing to them sets (or clears) only those bits that are 1 in the value you write, leaving every other bit alone. Also, the Cortex-M3 ARM CPU supports accessing a single bit (for read or write) in memory or in hardware registers through a specific address space, using a technique they call 'bit-banding'. The bit-banding algorithm looks complex at first glance, but it's really just some simple arithmetic to map the offset of a bit in one address to another 'bit-specific' address.
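The bit-banding arithmetic can be sketched in a few lines of C. On the Cortex-M3 each bit in the 1 MB bit-band region is mapped to its own 32-bit word in the alias region, so the alias address is the alias base plus 32 times the byte offset plus 4 times the bit number. The function name `bitband_alias` and the macro names are mine; the base addresses are the standard Cortex-M3 ones, but whether a particular chip implements bit-banding should be checked against its reference manual.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BB_BASE       0x20000000u  /* start of SRAM bit-band region      */
#define SRAM_ALIAS_BASE    0x22000000u  /* start of SRAM alias region         */
#define PERIPH_BB_BASE     0x40000000u  /* start of peripheral bit-band region */
#define PERIPH_ALIAS_BASE  0x42000000u  /* start of peripheral alias region   */

/* Returns the alias-word address for bit `bit` (0..7) of the byte at
   `byte_addr`, which must lie inside the region starting at `region_base`. */
static uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                              uint32_t byte_addr, unsigned bit)
{
    assert(bit < 8u);
    uint32_t byte_offset = byte_addr - region_base;
    return alias_base + byte_offset * 32u + bit * 4u;
}
```

For example, bit 7 of the byte at 0x20000300 maps to 0x22000000 + 0x300 * 32 + 7 * 4 = 0x2200601C. On a part that supports it, writing 1 or 0 to that word sets or clears just that bit, with the read-modify-write done by the bus hardware.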
Anyway, the bottom line is that there are some processors where you can get away without a read/modify/write series, but that's by no means universally true.
Modern processors can either set or clear bits with a single instruction. However, these instructions cannot both set and clear at the same time. There are cases where several bits of an I/O port must all change together without affecting the other bits. As long as the read-modify-write sequence cannot be broken, there is no problem.
The situation where the r-m-w can become a problem requires three conditions:
1) The variable must be globally accessible, such as an I/O port, a special function register, or a globally defined variable.
2) The global variable can be modified in a function that can be preempted.
3) The same global variable is modified while servicing a preemption.
The only way to make multiple-bit modifications safe with a non-atomic r-m-w sequence is to protect the sequence of instructions by disabling the interrupt whose service routine can also modify the variable or register. This is similar to gaining exclusive access to resources such as an LCD or a serial port.
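A rough sketch of that protection pattern in C follows. Everything here is simulated: `irq_enabled` is a plain flag standing in for the interrupt-enable state, `irq_save_and_disable` and `irq_restore` are hypothetical helpers (real code would use the compiler's or vendor's intrinsics, or mask only the one interrupt source that shares the register), and `port` is an ordinary variable rather than a real I/O port.

```c
#include <stdbool.h>
#include <stdint.h>

static bool irq_enabled = true;          /* simulated interrupt-enable state */
static volatile uint32_t port = 0xF0u;   /* stand-in for a shared I/O port   */

/* Hypothetical helper: disables the interrupt and reports the prior state. */
static bool irq_save_and_disable(void)
{
    bool was_enabled = irq_enabled;
    irq_enabled = false;
    return was_enabled;
}

/* Hypothetical helper: puts the interrupt back the way it was. */
static void irq_restore(bool was_enabled)
{
    irq_enabled = was_enabled;
}

/* Clears and sets several bits in one protected read-modify-write. */
static void port_update(uint32_t clear_mask, uint32_t set_mask)
{
    bool was_enabled = irq_save_and_disable();

    uint32_t value = port;   /* read                       */
    value &= ~clear_mask;    /* modify: clear, then set    */
    value |= set_mask;
    port = value;            /* write                      */

    irq_restore(was_enabled);
}
```

Saving and restoring the previous state, rather than unconditionally re-enabling, keeps the helper safe to call from code that already runs with the interrupt disabled.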
If you have to modify a subset of the bits in a word, and the architecture only supports word level read/write, you have to read the bits that must not change to know what to write back so that they are not modified.
Some architectures support bit-level memory access either globally or for specific regions of memory. But even then, when modifying multiple bits, a read-modify-write may result in fewer instructions. In multi-threaded systems, care must be taken to ensure that two threads cannot perform this non-atomic action on the same word concurrently.
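As an illustration of the multi-bit case, the sketch below rewrites a 3-bit field inside a word in a single read-modify-write instead of touching the bits one at a time. The field position, the macro names and `field_write` are invented for the example.

```c
#include <stdint.h>

#define FIELD_SHIFT 4u                      /* hypothetical field at bits 4..6 */
#define FIELD_MASK  (0x7u << FIELD_SHIFT)

/* Returns `word` with the 3-bit field replaced by `value`; all bits outside
   the field are carried over from the value that was read. */
static uint32_t field_write(uint32_t word, uint32_t value)
{
    return (word & ~FIELD_MASK) | ((value << FIELD_SHIFT) & FIELD_MASK);
}
```

With a register this would be used as `reg = field_write(reg, 2u);`, which is one read and one write no matter how many bits of the field change.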
For some registers that's good enough. In such a case the CPU's firmware is still going to do a read-modify-write.
If you let the CPU's firmware do the read-modify-write for you, it's obviously going to include at least a read cycle and a write cycle. Most CPUs won't interrupt this instruction in the middle, so your thread will execute the entire instruction before its CPU checks for interrupts; but if you haven't locked the bus, other CPUs can still modify the same register. Your thread and other threads can still walk all over each other.