Coming back to life after Segmentation Violation

Posted on 2024-09-10 08:01:35

Is it possible to restore the normal execution flow of a C program after a segmentation fault?

struct A {
    int x;
};
struct A* a = 0;

a->x = 123; // this is where segmentation violation occurs

// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....

I want a mechanism similar to NullPointerException that is present in Java, C# etc.

Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment, etc.

What I really want to achieve is to get back to the normal execution flow, as in the example above. I know some actions can be undertaken using POSIX signals. What should it look like? Any other ideas?

Comments (13)

若水微香 2024-09-17 08:01:35
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <signal.h>
#include <stdlib.h>
#include <ucontext.h>

void safe_func(void)
{
    puts("Safe now ?");
    exit(0); //can't return to main, it's where the segfault occurred.
}

void
handler (int cause, siginfo_t * info, void *uap)
{
  //For test only. Never call stdio functions in a signal handler otherwise.
  printf ("SIGSEGV raised at address %p\n", info->si_addr);
  ucontext_t *context = uap;
  /*On my particular system, compiled with gcc -O2, the offending instruction
  generated for "*f = 16;" is 6 bytes. Let's try to set the instruction
  pointer to the next instruction (general register 14 is EIP, on linux x86) */
  context->uc_mcontext.gregs[14] += 6; 
  //alternatively, try to jump to a "safe place"
  //context->uc_mcontext.gregs[14] = (unsigned int)safe_func;
}

int
main (int argc, char *argv[])
{
  struct sigaction sa;
  sa.sa_sigaction = handler;
  int *f = NULL;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  if (sigaction (SIGSEGV, &sa, 0)) {
      perror ("sigaction");
      exit(1);
  }
  //cause a segfault
  *f = 16; 
  puts("Still Alive");
  return 0;
}

$ ./a.out
SIGSEGV raised at address (nil)
Still Alive


I would beat someone with a bat if I saw something like this in production code, though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not async-signal-safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process for exit, logs it, and starts a new child process.
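A supervisor along those lines could look roughly like the sketch below, assuming a POSIX system. The names supervise, crashing_worker and healthy_worker are hypothetical and exist only for illustration; a real supervisor would normally exec a separate program and add back-off between restarts.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker used for illustration: it dies from SIGSEGV. */
static int crashing_worker(void)
{
    raise(SIGSEGV);               /* default action terminates the child */
    return 0;
}

/* Hypothetical worker that finishes cleanly. */
static int healthy_worker(void)
{
    return 0;
}

/*
 * Run worker() in a child process and restart it while it keeps being
 * killed by a signal, up to max_restarts restarts. Returns the number of
 * times the child was killed by a signal, or -1 if fork()/waitpid() failed.
 */
static int supervise(int (*worker)(void), int max_restarts)
{
    int crashes = 0;

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker());      /* child: do the work, report via exit status */

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;

        if (!WIFSIGNALED(status))
            break;                /* normal exit: nothing to restart */

        crashes++;
        fprintf(stderr, "child %ld killed by signal %d\n",
                (long)pid, WTERMSIG(status));
    }
    return crashes;
}
```

Calling supervise(crashing_worker, 2) runs the worker three times in total (one initial run plus two restarts). The parent's own memory is never touched by the crash, which is the point of the design.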

失眠症患者 2024-09-17 08:01:35

You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).

The signal name is SIGSEGV.

You will have to use the sigaction() function, from the signal.h header.

Basically, it works the following way:

struct sigaction sa1;
struct sigaction sa2;

sa1.sa_handler = your_handler_func;
sa1.sa_flags   = 0;
sigemptyset( &sa1.sa_mask );

sigaction( SIGSEGV, &sa1, &sa2 );

Here's the prototype of the handler function:

void your_handler_func( int id );

As you can see, the handler does not need to return anything. When it returns, the program's execution resumes (for SIGSEGV, at the instruction that faulted), unless you decide to stop it yourself from within the handler.

暮凉 2024-09-17 08:01:35

"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.

墨落成白 2024-09-17 08:01:35

See R.'s comment on MacMade's answer.

Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU and OS can return you to the offending instruction), here is a test I have for division-by-zero handling:

#include <stdio.h>
#include <limits.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>

static jmp_buf  context;

static void sig_handler(int signo)
{
    /* XXX: don't do this, not reentrant */
    printf("Got SIGFPE\n");

    /* avoid infinite loop */
    longjmp(context, 1);
}

int main()
{
    int a;
    struct sigaction sa;

    memset(&sa, 0, sizeof(struct sigaction));
    sa.sa_handler = sig_handler;
    sa.sa_flags = SA_RESTART;
    sigaction(SIGFPE, &sa, NULL);

    if (setjmp(context)) {
            /* If this one was on setjmp's block,
             * it would need to be volatile, to
             * make sure the compiler reloads it.
             */
            sigset_t ss;

            /* Make sure to unblock SIGFPE, according to POSIX it
             * gets blocked when calling its signal handler.
             * sigsetjmp()/siglongjmp would make this unnecessary.
             */
            sigemptyset(&ss);
            sigaddset(&ss, SIGFPE);
            sigprocmask(SIG_UNBLOCK, &ss, NULL);

            goto skip;
    }

    a = 10 / 0;
skip:
    printf("Exiting\n");

    return 0;
}
不离久伴 2024-09-17 08:01:35

No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug; the only safe thing to do is to exit.

Consider some of the possible causes of a segmentation fault:

  • you forgot to assign a legitimate value to a pointer
  • a pointer has been overwritten possibly because you are accessing heap memory you have freed
  • a bug has corrupted the heap
  • a bug has corrupted the stack
  • a malicious third party is attempting a buffer overflow exploit
  • malloc returned null because you have run out of memory

Only in the first case is there any kind of reasonable expectation that you might be able to carry on.

If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.

Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:

void foobarMyProcess(struct SomeStruct* structPtr)
{
    char* aBuffer = structPtr->aBigBufferWithLotsOfSpace; // if structPtr is NULL, will SIGSEGV
    //
    // if you SIGSEGV and come back to here, at this point aBuffer contains whatever garbage was in memory at the point
    // where the stack frame was created
    //
    strcpy(aBuffer, "Some longish string");  // You've just written the string to some random location in your address space
                                             // good luck with that!

}
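
For the struct from the question, the test-before-dereference approach could be sketched as follows. The function name a_set_x and the choice of EINVAL are assumptions made for this illustration, not something the question or this answer prescribes.

```c
#include <errno.h>
#include <stddef.h>

struct A {
    int x;
};

/*
 * Store value in a->x. Returns 0 on success, or -1 with errno set to
 * EINVAL if a is NULL, so the caller gets an error instead of a SIGSEGV.
 */
static int a_set_x(struct A *a, int value)
{
    if (a == NULL) {
        errno = EINVAL;
        return -1;
    }
    a->x = value;
    return 0;
}
```

The caller checks the return value and decides what to do, which is the closest plain C gets to catching a NullPointerException.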
挽清梦 2024-09-17 08:01:35

Call this, and when a segfault occurs, your code will execute segv_handler and then continue back to where it was.

void segv_handler(int signo)
{
  // Do what you want here
}

signal(SIGSEGV, segv_handler);
回忆追雨的时光 2024-09-17 08:01:35

There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill.

Just do yourself a favour and take error cases into account.

风启觞 2024-09-17 08:01:35

In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.

陌若浮生 2024-09-17 08:01:35

You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it "commented out" at run time the instructions that generate segfaults, what would be left of the original program logic (if it may be called that)?
Everything is possible, but it doesn't mean that it has to be done.

许一世地老天荒 2024-09-17 08:01:35

Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.

What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.

You could also rewrite the functions yourself.
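
A minimal sketch of the new-process approach, assuming a POSIX system: the child runs the function and sends its result back through a pipe, and the parent reports either the result or the signal that killed the child. Here risky_half stands in for the buggy function, and both it and call_isolated are hypothetical names.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the buggy function: crashes on negative input. */
static int risky_half(int n)
{
    if (n < 0)
        raise(SIGSEGV);
    return n / 2;
}

/*
 * Call fn(arg) in a child process. Returns 0 and stores the result in
 * *result on success, the terminating signal number if the child was
 * killed by a signal, or -1 on any other failure.
 */
static int call_isolated(int (*fn)(int), int arg, int *result)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       /* child */
        close(fds[0]);
        int value = fn(arg);
        ssize_t n = write(fds[1], &value, sizeof value);
        _exit(n == (ssize_t)sizeof value ? 0 : 1);
    }

    close(fds[1]);                        /* parent */
    int value = 0;
    ssize_t n = read(fds[0], &value, sizeof value);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);          /* e.g. SIGSEGV */
    if (n != (ssize_t)sizeof value || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *result = value;
    return 0;
}
```

Because the child has its own address space, whatever the buggy function corrupts before crashing stays in the child. The cost is a fork() per call and a result type that can be serialized through the pipe.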

客…行舟 2024-09-17 08:01:35

I can see a case for recovering from a segmentation violation: if you're handling events in a loop and one of these events causes a segmentation violation, then you would only want to skip over that event and continue processing the remaining events. In my eyes, segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all other orders.

送君千里 2024-09-17 08:01:35

The best solution is to wrap each unsafe access this way:

#include <cstdlib>
#include <iostream>
#include <signal.h>
#include <setjmp.h>
static jmp_buf buf;
int counter = 0;
void signal_handler(int)
{
    longjmp(buf, 0);
}
int main()
{
    signal(SIGSEGV, signal_handler);
    setjmp(buf);
    if (counter++ == 0) { // if we didn't try before
        *(int*)(0x1215) = 10;  // write to an arbitrary, normally unmapped address
    }
    std::cout << "i am alive !!" << std::endl; // we will get here in any case
    system("pause"); // Windows-only: waits for a key press
    return 0;
}

With this in place, your program will not crash on almost any OS.

放飞的风筝 2024-09-17 08:01:35

This glibc manual gives you a clear picture of how to write signal handlers.

A signal handler is just a function that you compile together with the rest
of the program. Instead of directly invoking the function, you use signal 
or sigaction to tell the operating system to call it when a signal arrives.
This is known as establishing the handler.

In your case you will have to wait for the SIGSEGV indicating a segmentation fault. The list of other signals can be found here.

Signal handlers are broadly classified into two categories:

  1. You can have the handler function note that the signal arrived by tweaking some
    global data structures, and then return normally.
  2. You can have the handler function terminate the program or transfer
    control to a point where it can recover from the situation that caused the signal.

SIGSEGV comes under program error signals.
