Bus error vs. segmentation fault

Posted 2024-07-18 08:58:08

What is the difference between a bus error and a segmentation fault?
Can a program give a seg fault and stop the first time it runs, and give a bus error and exit the second time?

Comments (7)

蘸点软妹酱 2024-07-25 08:58:08

On most architectures I've used, the distinction is that:

  • a SIGSEGV is raised when you access memory you're not meant to (e.g., outside of your address space).
  • a SIGBUS is raised because of CPU alignment issues (e.g., trying to read a long from an address that isn't a multiple of 4).
你与昨日 2024-07-25 08:58:08

SIGBUS will also be raised if you mmap() a file and attempt to access part of the mapped buffer that extends past the end of the file, as well as for error conditions such as out of space. If you register a signal handler using sigaction() and you set SA_SIGINFO, it may be possible to have your program examine the faulting memory address and handle only memory mapped file errors.
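
The sigaction() approach described here might look roughly like the sketch below. It is only an illustration under assumptions: the names on_sigbus, install_sigbus_handler, access_raises_sigbus and last_fault_addr are hypothetical, and recovering from a fault with siglongjmp() is one possible strategy, not the only one.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf recover_point;   /* where execution resumes after a fault */
void *volatile last_fault_addr;    /* si_addr reported for the last SIGBUS */

/* SA_SIGINFO handler: record the faulting address, then jump back. */
static void on_sigbus(int signo, siginfo_t *info, void *context) {
    (void)signo;
    (void)context;
    last_fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Installs the handler; returns 0 on success and -1 on failure. */
int install_sigbus_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Reads one byte at p. Returns 1 if the read raised SIGBUS, 0 otherwise. */
int access_raises_sigbus(const volatile char *p) {
    if (sigsetjmp(recover_point, 1) != 0)
        return 1;   /* reached via siglongjmp from the handler */
    char c = *p;    /* volatile read: this is the access that may fault */
    (void)c;
    return 0;
}
```

To handle only memory-mapped file errors, the caller could compare last_fault_addr against the start and length of its own mapping and treat any address outside that range as a genuine bug.
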

毁虫ゝ 2024-07-25 08:58:08

For instance, a bus error might be caused when your program tries to do something that the hardware bus doesn't support. On SPARCs, for instance, trying to read a multi-byte value (such as a 32-bit int) from an odd address generates a bus error.

Segmentation faults happen, for instance, when you do an access that violates the segmentation rules, i.e. trying to read or write memory that you don't own.
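
One common way to sidestep the misaligned-read case is to copy the bytes rather than dereference a cast pointer. The sketch below assumes that approach; the names load_u32 and store_u32 are hypothetical helpers, not functions from any library.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from an address of any alignment. memcpy has no
   alignment requirement, so this stays valid where *(uint32_t *)p
   would be undefined behaviour for an odd address. */
uint32_t load_u32(const void *p) {
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Writes a 32-bit value to an address of any alignment. */
void store_u32(void *p, uint32_t value) {
    memcpy(p, &value, sizeof value);
}
```

Compilers typically turn these small fixed-size copies into whatever load or store sequence the target supports, so the portable form need not be slow.
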

我早已燃尽 2024-07-25 08:58:08

Interpreting your question (possibly incorrectly) as meaning "I am intermittently getting a SIGSEGV or a SIGBUS, why isn't it consistent?", it's worth noting that doing dodgy things with pointers is not guaranteed by the C or C++ standards to result in a segfault; it's just "undefined behaviour", which as a professor I had once put it means that it may instead cause crocodiles to emerge from the floorboards and eat you.

So your situation could be that you have two bugs, where the first to occur sometimes causes SIGSEGV, and the second (if the segfault didn't happen and the program is still running) causes a SIGBUS.

I recommend you step through with a debugger, and look out for crocodiles.

酒与心事 2024-07-25 08:58:08

I assume you're talking about the SIGSEGV and SIGBUS signals defined by POSIX.

SIGSEGV occurs when the program references an invalid address. SIGBUS is an implementation-defined hardware fault. The default action for these two signals is to terminate the program.

The program can catch these signals, and even ignore them.
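
To see the default action from the outside, a parent process can run the failing code in a child and ask which signal terminated it. This is a sketch under assumptions: terminating_signal_of_child is a hypothetical name, and the child simply raises the signal itself instead of performing a real bad access.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that raises `signo` with the default disposition.
   Returns the signal that terminated the child, 0 if the child
   exited normally, or -1 if fork() or waitpid() failed. */
int terminating_signal_of_child(int signo) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(signo, SIG_DFL);   /* make sure the default action applies */
        raise(signo);
        _exit(0);                 /* only reached if the signal did not kill us */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A shell does much the same after a command dies, which is where messages such as "Segmentation fault" and "Bus error" come from.
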

等待我真够勒 2024-07-25 08:58:08

Can it happen that a program gives a seg fault and stops for the first time and for the second time it may give a bus error and exit?

Yes, even for one and the same bug. Here is a serious but simplistic example from macOS that deterministically produces both a segmentation fault (SIGSEGV) and a bus error (SIGBUS) by indexing outside the bounds of an array. The unaligned access mentioned in other answers is not an issue on macOS. (This example will not cause any SIGBUS if it runs inside a debugger, lldb in my case!)

bus_segv.c:

#include <stdlib.h>

char array[10];

int main(int argc, char *argv[]) {
    return array[atol(argv[1])];
}

The example takes an integer from the command line, which serves as the index into the array. There are some index values (even outside the array) that will not cause any signal. (All values given depend on the standard segment/section sizes. I used clang-902.0.39.1 to produce the binary on macOS High Sierra 10.13.5, i5-4288U CPU @ 2.60GHz.)

An index above 77791 or below -4128 will cause a segmentation fault (SIGSEGV). An index of 24544 will cause a bus error (SIGBUS). Here is the complete map:

$ ./bus_segv -4129
Segmentation fault: 11
$ ./bus_segv -4128
...
$ ./bus_segv 24543
$ ./bus_segv 24544
Bus error: 10
...
$ ./bus_segv 28639
Bus error: 10
$ ./bus_segv 28640
...
$ ./bus_segv 45023
$ ./bus_segv 45024
Bus error: 10
...
$ ./bus_segv 53215
Bus error: 10
$ ./bus_segv 53216
...
$ ./bus_segv 69599
$ ./bus_segv 69600
Bus error: 10
...
$ ./bus_segv 73695
Bus error: 10
$ ./bus_segv 73696
...
$ ./bus_segv 77791
$ ./bus_segv 77792
Segmentation fault: 11

If you look at the disassembled code, you see that the boundaries of the bus error ranges are not as odd as the index values make them appear:

$ otool -tv bus_segv

bus_segv:
(__TEXT,__text) section
_main:
0000000100000f60    pushq   %rbp
0000000100000f61    movq    %rsp, %rbp
0000000100000f64    subq    $0x10, %rsp
0000000100000f68    movl    $0x0, -0x4(%rbp)
0000000100000f6f    movl    %edi, -0x8(%rbp)
0000000100000f72    movq    %rsi, -0x10(%rbp)
0000000100000f76    movq    -0x10(%rbp), %rsi
0000000100000f7a    movq    0x8(%rsi), %rdi
0000000100000f7e    callq   0x100000f94 ## symbol stub for: _atol
0000000100000f83    leaq    0x96(%rip), %rsi
0000000100000f8a    movsbl  (%rsi,%rax), %eax
0000000100000f8e    addq    $0x10, %rsp
0000000100000f92    popq    %rbp    
0000000100000f93    retq    

With leaq 0x96(%rip), %rsi, rsi becomes the (PC-relative) address of the start of array:

rsi = 0x100000f8a + 0x96 = 0x100001020
rsi - 4128 = 0x100000000 (below segmentation fault)
rsi + 24544 = 0x100007000 (here and above bus error)
rsi + 28640 = 0x100008000 (below bus error)
rsi + 45024 = 0x10000c000 (here and above bus error)
rsi + 53216 = 0x10000e000 (below bus error)
rsi + 69600 = 0x100012000 (here and above bus error)
rsi + 73696 = 0x100013000 (below bus error)
rsi + 77792 = 0x100014000 (here and above segmentation fault)
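
The arithmetic above can be checked with a few lines of C. This is a sketch only: the function names are hypothetical, and the 4096-byte page size is an assumption that matches the multiples of 0x1000 seen in this listing.

```c
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Address touched by array[index] for a char array starting at array_base.
   The unsigned addition wraps correctly for negative indexes. */
uint64_t element_address(uint64_t array_base, int64_t index) {
    return array_base + (uint64_t)index;
}

/* Offset of an address within its page; 0 means it sits on a page boundary. */
uint64_t offset_in_page(uint64_t address) {
    return address % ASSUMED_PAGE_SIZE;
}
```

With array_base set to 0x100001020, every index at which the behaviour changes (-4128, 24544, 28640 and so on) yields an offset of 0, so each transition falls exactly on a page boundary.
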

lldb probably sets up the process with different page limits. I was not able to reproduce any bus errors in a debug session. So the debugger might be a workaround for binaries that spit out bus errors.

Andreas

梦罢 2024-07-25 08:58:08

This would be a duplicate of "What is a bus error?" if it weren't for the

Can it happen that a program gives a seg fault and stops for the first time and for the second time it may give a bus error and exit ?

part of the question. You should be able to answer this for yourself with the information found here.


Insanity: doing the same thing over and over again and expecting different results.
-- Albert Einstein


Of course, taking the question literally...

#include <signal.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
int main() {
    srand(time(NULL));
    if (rand() % 2)
        kill(getpid(), SIGBUS);
    else
        kill(getpid(), SIGSEGV);
    return 0;
}

Tada, a program that can exit with a segmentation fault on one run and exit with a bus error on another run.
