What are traps?

Posted 2024-12-11 09:41:38

There are many different types of traps listed in processor datasheets, e.g. BusFault, MemManage Fault, Usage Fault and Address Error.

What is their purpose? How can they be utilized in fault handling?

Comments (4)

风蛊 2024-12-18 09:41:38

Traps are essentially subroutine calls that are forced by the processor when it detects something unusual in your stream of instructions. (Some processors make them into interrupts, but that's mostly just pushing more context onto the stack; this gets more interesting if the trap includes a switch between user and system address spaces).

This is useful for handling conditions that occur rarely but need to be addressed, such as division by zero. Normally, it is useless overhead to have an extra pair of instructions to test the divisor for zero before executing a divide instruction, since the divisor is never expected to be zero. So the architects have the processor do this check in parallel with the actual divide as part of the divide instruction, and cause the processor to trap to a divide-by-zero routine if the divisor is zero. Another interesting case is illegal-memory-address; clearly, you don't want to have to code a test to check each address before you use it.

Often there are a variety of fault conditions of potential interest and the processor by design will pass control to a different trap routine (often set up as a vector) for each different type of fault.
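
To make the "vector" idea concrete, here is a minimal host-side sketch of a table with one routine per fault type. Everything in it is hypothetical: the fault numbering, the handler names and `take_trap` are made up for illustration, and on a real CPU the lookup is done by hardware, not by a C function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fault numbers; the real numbering is defined by the CPU. */
typedef enum {
    FAULT_DIVIDE_BY_ZERO = 0,
    FAULT_BAD_ADDRESS,
    FAULT_UNDEFINED_INSTRUCTION,
    FAULT_COUNT
} fault_id_t;

typedef void (*trap_handler_t)(void);

/* FAULT_COUNT doubles as "no trap taken yet". */
static fault_id_t last_fault = FAULT_COUNT;

static void on_divide_by_zero(void)        { last_fault = FAULT_DIVIDE_BY_ZERO; }
static void on_bad_address(void)           { last_fault = FAULT_BAD_ADDRESS; }
static void on_undefined_instruction(void) { last_fault = FAULT_UNDEFINED_INSTRUCTION; }

/* The vector: one routine per fault type, indexed by fault number. */
static const trap_handler_t trap_vector[FAULT_COUNT] = {
    [FAULT_DIVIDE_BY_ZERO]        = on_divide_by_zero,
    [FAULT_BAD_ADDRESS]           = on_bad_address,
    [FAULT_UNDEFINED_INSTRUCTION] = on_undefined_instruction,
};

/* Software stand-in for the forced subroutine call the hardware performs. */
void take_trap(fault_id_t id)
{
    assert((unsigned)id < (unsigned)FAULT_COUNT);
    trap_vector[id]();
}

/* Reports which handler ran most recently. */
fault_id_t last_fault_taken(void)
{
    return last_fault;
}
```

Calling `take_trap(FAULT_BAD_ADDRESS)` ends up in `on_bad_address` without the caller naming that routine, which is the whole point of routing faults through a table.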

Once the processor has a trap facility, the CPU architects find lots of uses for it. Common ones are debugger breakpoints and trapping into the OS to execute an operating system call.

谈情不如逗狗 2024-12-18 09:41:38

Microprocessors have traps for various fault conditions. They are synchronous interrupts that allow the running OS / software to take appropriate action on the error. Traps interrupt program flow and set register bits to indicate the fault. Debugger breakpoints are also implemented using traps.

In a typical computing environment, the operating system takes care of CPU traps triggered by user processes. Let's consider what happens when I run the following program:

int main(void)
{
    volatile int a = 1, b = 0;
    a = a % b; /* div by zero */
    return 0;
}

An error message is displayed, and my box keeps running as if nothing happened. My operating system's approach to fault handling in this case is to kill the offending process and inform the user with the error message "Floating point exception".
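
On a POSIX system the kernel reports this trap to the process as the signal SIGFPE, and a process may install its own handler instead of being killed. The sketch below shows that mechanism under some assumptions: `run_guarded`, `simulated_divide_trap` and `harmless` are names invented here, and the trap is simulated with `raise(SIGFPE)` because not every architecture traps on integer division by zero.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf recovery_point;

/* Runs in place of the default action (killing the process). */
static void on_sigfpe(int signo)
{
    siglongjmp(recovery_point, signo);
}

/* Runs fn() with a SIGFPE handler installed.
 * Returns 0 if fn() finished, SIGFPE if the signal arrived,
 * or -1 if the handler could not be installed. */
int run_guarded(void (*fn)(void))
{
    struct sigaction sa, old;
    int result = SIGFPE;   /* only changed on the normal-return path */

    assert(fn != NULL);
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -1;

    /* The second argument saves the signal mask so that
     * siglongjmp() restores it when leaving the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        fn();
        result = 0;
    }

    sigaction(SIGFPE, &old, NULL);   /* put the previous handler back */
    return result;
}

/* Stand-in for code that hits a divide trap. */
void simulated_divide_trap(void)
{
    raise(SIGFPE);
}

/* Stand-in for code that runs without any fault. */
void harmless(void)
{
}
```

`run_guarded(simulated_divide_trap)` returns `SIGFPE` while `run_guarded(harmless)` returns 0. Jumping out of the handler matters for a real divide trap: simply returning would typically re-execute the faulting instruction and trap again.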

Traps in kernel mode are more problematic. It is not as straightforward for the OS to take corrective action if it is itself at fault. For a system process there is no underlying layer of protection. This is why faulty device drivers can cause real problems.

When working on bare metal, without the comforting protection of an operating system, the situation is much the same as in kernel mode. The number one objective for continuous and correct operation is to catch all potential trap conditions before they trigger any traps, using assertions and higher-level error handlers. Consider traps the last line of defense, a safety net you don't want to fall into.
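
As a small illustration of catching the condition before the hardware does, here is a sketch of a checked remainder. The name `checked_mod` and its return convention are assumptions made for this example. It also guards `INT_MIN % -1`, which is undefined behavior in C and raises the same divide trap on x86.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Computes a % b without ever executing a divide that could trap.
 * Returns true and stores the remainder in *result on success;
 * returns false and leaves *result untouched if b is zero. */
bool checked_mod(int a, int b, int *result)
{
    assert(result != NULL);      /* programming error, not a runtime condition */

    if (b == 0)
        return false;            /* the caller decides how to handle this */

    if (a == INT_MIN && b == -1) {
        *result = 0;             /* mathematically correct; the raw % is undefined */
        return true;
    }

    *result = a % b;
    return true;
}
```

The `assert` covers the "should never happen" misuse, while the `false` return is the higher-level error path that the caller is expected to handle.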

Defining behaviors for trap handlers is worth some thought, even if the traps "should never happen". The handlers will be executed when things go wrong in an unanticipated manner, in the most extreme case because cosmic rays altered RAM. Unfortunately, there is no single correct answer to what error handlers should do.

Code Complete, 2nd ed:

The style of error processing that is most appropriate depends on the kind of software the error occurs in and generally favors more correctness or more robustness. Strictly speaking, these terms are at opposite ends of the scale from each other. Correctness means never returning an inaccurate result; no result is better than an inaccurate result. Robustness means always trying to do something that will allow the software to keep operating, even if that leads to results that are inaccurate sometimes.

Clearly, my operating system's fault handling is designed with robustness in mind; I can execute flawed code and do pretty much anything without crashing the system. Designing solely for robustness would mean attempting recovery whenever possible and, if all else fails, resetting. This is a suitable approach if your product is, for example, a toy.

Safety-critical applications need a bit more paranoia and should favor correctness instead: when a fault is detected, write an error log and shut down. We don't want our radiation therapy unit to pick dosage levels from invalid garbage values.

梦行七里 2024-12-18 09:41:38

The ARMv7-M (not to be confused with ARM7 or ARMv7-A) Cortex-M3 Technical Reference Manual, which may also be part of one of the newer ARM ARMs (ARM Architecture Reference Manual), has a section describing each one of these faults.

Now the whys versus the whats are perhaps at the root of the question. The why is usually to give you a chance to recover. Imagine your set-top box or telephone hits one of these faults: do you want it to hang, or to try to recover if possible? Unless you are expecting one of these faults (which in this context you shouldn't be; x86 systems and some of their faults are a completely different story), if you survive long enough to hit one you will most likely end up pulling the trigger on yourself (the software killing itself by resetting the processor or system). You can go through the long list and try to find ones you can recover from. Divide by zero: how is the exception handler to know what math mistake led to it? In general it can't. Unaligned load or store: how is the handler to know what the code was trying to do? Like divide by zero, it is probably a software bug. Undefined instruction: the code went into the weeds and executed data; by this point you are most likely too far gone to recover. Any kind of memory bus fault: the handler cannot repair the hardware.
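
For the Cortex-M3 specifically, a handler can tell these cases apart by reading the Configurable Fault Status Register (CFSR). Here is a host-side sketch of decoding it. The bit positions are given as I understand them from the ARMv7-M documentation, so check them against the manual for your part; the enum and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* CFSR layout as assumed here: MemManage status in bits 7:0,
 * BusFault status in bits 15:8, UsageFault status in bits 31:16. */
#define CFSR_MEMMANAGE_MASK  0x000000FFu
#define CFSR_BUSFAULT_MASK   0x0000FF00u
#define CFSR_USAGEFAULT_MASK 0xFFFF0000u
#define CFSR_UNDEFINSTR      (1u << 16)
#define CFSR_UNALIGNED       (1u << 24)
#define CFSR_DIVBYZERO       (1u << 25)

/* Hypothetical classification used only by this sketch. */
typedef enum {
    FAULT_NONE,
    FAULT_MEMMANAGE,
    FAULT_BUS,
    FAULT_USAGE_UNDEFINED_INSTRUCTION,
    FAULT_USAGE_UNALIGNED,
    FAULT_USAGE_DIVIDE_BY_ZERO,
    FAULT_USAGE_OTHER
} fault_class_t;

/* Classifies a CFSR value; specific UsageFault causes are checked first. */
fault_class_t classify_cfsr(uint32_t cfsr)
{
    if (cfsr & CFSR_DIVBYZERO)       return FAULT_USAGE_DIVIDE_BY_ZERO;
    if (cfsr & CFSR_UNALIGNED)       return FAULT_USAGE_UNALIGNED;
    if (cfsr & CFSR_UNDEFINSTR)      return FAULT_USAGE_UNDEFINED_INSTRUCTION;
    if (cfsr & CFSR_USAGEFAULT_MASK) return FAULT_USAGE_OTHER;
    if (cfsr & CFSR_BUSFAULT_MASK)   return FAULT_BUS;
    if (cfsr & CFSR_MEMMANAGE_MASK)  return FAULT_MEMMANAGE;
    return FAULT_NONE;
}
```

On the target the argument would come from the register itself (at address 0xE000ED28, if I have the memory map right) rather than from a test value. Knowing the class still does not tell the handler what the code was trying to do, which is the point made above.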

You have to go through every fault and define, for each one, how you are going to handle it: all the ways you could have gotten to that fault, and the ways you can get out of or handle each of those paths. On occasion you might be able to recover; otherwise you need a default action. For example, hang the processor in an infinite loop in the handler so that a software engineer, if available, can get in with a debugger and find where the code stopped. Or have a watchdog timer, inside or outside the chip depending on the chip and board design (often it is outside the chip, so that the WDT resets the whole board). You might have some non-volatile memory in which you attempt to store the fault before letting or causing the reset, but the time and code it takes to do that might lead you into another fault, depending on what is failing.
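
A minimal sketch of the "store the fault, then stop" default action might look like this. All of it is assumed for illustration: the record layout, the magic value, and the use of a plain static variable standing in for real non-volatile memory. The store path is kept to a handful of plain writes to reduce the chance of causing a second fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAULT_RECORD_MAGIC 0xFA17C0DEu   /* arbitrary marker value */

typedef struct {
    uint32_t magic;          /* FAULT_RECORD_MAGIC when a record is present */
    uint32_t fault_id;       /* which fault was taken */
    uint32_t fault_address;  /* where it happened, if known */
    uint32_t check;          /* XOR of the three fields above */
} fault_record_t;

/* Stand-in for a non-volatile region that survives a reset. */
static fault_record_t nv_fault_record;

static uint32_t record_check(const fault_record_t *r)
{
    return r->magic ^ r->fault_id ^ r->fault_address;
}

/* Called from the fault handler: a few plain stores, no loops, no calls out. */
void fault_record_store(uint32_t fault_id, uint32_t fault_address)
{
    nv_fault_record.fault_id      = fault_id;
    nv_fault_record.fault_address = fault_address;
    nv_fault_record.magic         = FAULT_RECORD_MAGIC;
    nv_fault_record.check         = record_check(&nv_fault_record);
}

/* Called after reset: returns true and copies the record if it is intact. */
bool fault_record_load(fault_record_t *out)
{
    if (nv_fault_record.magic != FAULT_RECORD_MAGIC)
        return false;
    if (nv_fault_record.check != record_check(&nv_fault_record))
        return false;
    *out = nv_fault_record;
    return true;
}

/* Erases the record once it has been reported. */
void fault_record_clear(void)
{
    nv_fault_record.magic = 0u;
    nv_fault_record.check = 0u;
}

/* Default action: park here so a debugger can attach,
 * or until a watchdog timer resets the system. */
void fault_default_action(void)
{
    for (;;) {
    }
}
```

A handler would call `fault_record_store(...)` followed by `fault_default_action()`, and the startup code would call `fault_record_load(...)` to find out why the previous run ended.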

友谊不毕业 2024-12-18 09:41:38

Simply put, they allow you to execute code when something happens in the processor. They're sometimes used by the OS for error recovery.
