Coming back to life after a segmentation violation
Is it possible to restore the normal execution flow of a C program, after the Segmentation Fault error?
struct A {
int x;
};
struct A *a = 0;
a->x = 123; // this is where segmentation violation occurs
// after handling the error I want to get back here:
printf("normal execution");
// the rest of my source code....
I want a mechanism similar to NullPointerException that is present in Java, C# etc.
Note: Please don't tell me that there is an exception handling mechanism in C++, because I know that, and don't tell me I should check every pointer before assignment, etc.
What I really want to achieve is to get back to the normal execution flow, as in the example above. I know some actions can be undertaken using POSIX signals. What should it look like? Other ideas?
I would beat someone with a bat if I saw something like this in production code, though; it's an ugly, for-fun hack. You'll have no idea whether the segfault has corrupted some of your data, you'll have no sane way of recovering and knowing that everything is OK now, and there's no portable way of doing this. The only mildly sane thing you could do is try to log an error (use write() directly, not any of the stdio functions, which are not async-signal-safe) and perhaps restart the program. For those cases you're much better off writing a supervisor process that monitors a child process's exit, logs it, and starts a new child process.
You can catch segmentation faults using a signal handler, and decide to continue the execution of the program (at your own risk).
The signal name is SIGSEGV. You will have to use the sigaction() function, from the signal.h header. Basically, it works the following way:
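A minimal sketch of what such a registration could look like, assuming a POSIX system. The names log_and_exit and install_segv_handler are made up for illustration, and the handler here takes the conservative route of logging and exiting instead of trying to resume.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: report the fault with write(), which is
 * async-signal-safe, then leave without running atexit handlers. */
static void log_and_exit(int signo)
{
    static const char msg[] = "caught SIGSEGV, exiting\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    (void)signo;
    _exit(1);
}

/* Hypothetical helper: installs the handler for SIGSEGV.
 * Returns 0 on success and -1 on failure, like sigaction() itself. */
int install_segv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = log_and_exit;
    sigemptyset(&sa.sa_mask);   /* block no extra signals while handling */
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_handler() once, early in main(), would be enough for the whole process.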
Here's the prototype of the handler function:
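As a hedged illustration, POSIX accepts two handler shapes: a one-argument form stored in sa_handler, and a three-argument form stored in sa_sigaction when the SA_SIGINFO flag is set. The names basic_handler, info_handler, last_signal and install_info_handler below are assumptions made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L  /* expose siginfo_t and sigaction() */
#include <signal.h>
#include <string.h>

/* Set by the handlers so the rest of the program can see what arrived. */
volatile sig_atomic_t last_signal = 0;

/* One-argument form, assigned to sa_handler. */
void basic_handler(int signo)
{
    last_signal = signo;
}

/* Three-argument form, assigned to sa_sigaction with SA_SIGINFO.
 * For a real fault, info->si_addr holds the offending address. */
void info_handler(int signo, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    last_signal = signo;
}

/* Hypothetical helper: installs info_handler for the given signal. */
int install_info_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = info_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Both forms return void. With this handler installed, a SIGSEGV generated by raise() lets the handler return normally; after a real invalid access, returning typically re-runs the faulting instruction.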
As you can see, you don't need to return. The program's execution will continue, unless you decide to stop it by yourself from the handler.
"All things are permissible, but not all are beneficial" - typically a segfault is game over for a good reason... A better idea than picking up where it was would be to keep your data persisted (database, or at least a file system) and enable it to pick up where it left off that way. This will give you much better data reliability all around.
See R.'s comment on MacMade's answer.
Expanding on what he said (after handling SIGSEGV or, for that matter, SIGFPE, the CPU and OS can return you to the offending instruction), here is a test I have for division-by-zero handling:
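A minimal sketch of a division-by-zero test along these lines, under stated assumptions: it targets platforms such as x86-64 Linux where integer division by zero raises SIGFPE (on some architectures no signal is raised at all), and it swaps in sigsetjmp()/siglongjmp() to jump to a recovery point instead of resuming at the offending instruction. The names on_fpe and checked_div are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and sigsetjmp() */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_recovery;

/* Instead of returning (which would retry the division), jump back
 * to the recovery point saved by sigsetjmp(). */
static void on_fpe(int signo)
{
    (void)signo;
    siglongjmp(fpe_recovery, 1);
}

/* Hypothetical helper: stores a / b in *out and returns 0, or returns -1
 * if SIGFPE was caught or the handler could not be installed. */
int checked_div(int a, int b, int *out)
{
    struct sigaction sa, old;
    volatile int num = a, den = b;  /* keep the division at run time */
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGFPE, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(fpe_recovery, 1) == 0) {  /* 1: also save the signal mask */
        *out = num / den;
        rc = 0;
    } else {
        rc = -1;                            /* arrived here via siglongjmp */
    }

    sigaction(SIGFPE, &old, NULL);          /* restore the previous handler */
    return rc;
}
```

Division by zero is undefined behavior in standard C, so this only demonstrates what one platform happens to do; it is not something the language guarantees.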
No, it's not possible, in any logical sense, to restore normal execution following a segmentation fault. Your program just tried to dereference a null pointer. How are you going to carry on as normal if something your program expects to be there isn't? It's a programming bug; the only safe thing to do is to exit.
Consider some of the possible causes of a segmentation fault:
Only in the first case is there any kind of reasonable expectation that you might be able to carry on
If you have a pointer that you want to dereference but it might legitimately be null, you must test it before attempting the dereference. I know you don't want me to tell you that, but it's the right answer, so tough.
Edit: here's an example to show why you definitely do not want to carry on with the next instruction after dereferencing a null pointer:
Call this, and when a segfault occurs, your code will execute segv_handler and then continue back to where it was.
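One hedged way to get back to a known point is sketched below, assuming POSIX sigsetjmp()/siglongjmp(). Unlike a handler that simply returns, this segv_handler jumps to a recovery point chosen in advance, because returning after a real fault typically re-runs the faulting instruction. The helper name try_write is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and sigsetjmp() */
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf segv_recovery;

/* Jump to the recovery point rather than returning to the bad store. */
static void segv_handler(int signo)
{
    (void)signo;
    siglongjmp(segv_recovery, 1);
}

/* Hypothetical helper: tries *p = value. Returns 0 if the store worked,
 * -1 if it raised SIGSEGV or the handler could not be installed. */
int try_write(int *p, int value)
{
    struct sigaction sa, old;
    int *volatile target = p;  /* volatile: the store must really happen */
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = segv_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(segv_recovery, 1) == 0) {
        *target = value;       /* may fault */
        rc = 0;
    } else {
        rc = -1;               /* reached via siglongjmp from the handler */
    }

    sigaction(SIGSEGV, &old, NULL);
    return rc;
}
```

This only shows that control can come back. It says nothing about whether the program's data is still consistent after the fault.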
There is no meaningful way to recover from a SIGSEGV unless you know EXACTLY what caused it, and there's no way to do that in standard C. It may be possible (conceivably) in an instrumented environment, like a C-VM (?). The same is true for all program error signals; if you try to block/ignore them, or establish handlers that return normally, your program will probably break horribly when they happen unless perhaps they're generated by raise or kill. Just do yourself a favour and take error cases into account.
In POSIX, your process will get sent SIGSEGV when you do that. The default handler just crashes your program. You can add your own handler using the signal() call. You can implement whatever behaviour you like by handling the signal yourself.
You can use the SetUnhandledExceptionFilter() function (on Windows), but even to be able to skip the "illegal" instruction you will need to be able to decode some assembler opcodes. And, as glowcoder said, even if it could "comment out" at runtime the instructions that generate segfaults, what would be left of the original program logic (if it may be called so)?
Everything is possible, but it doesn't mean that it has to be done.
Unfortunately, you can't in this case. The buggy function has undefined behavior and could have corrupted your program's state.
What you CAN do is run the functions in a new process. If this process dies with a return code that indicates SIGSEGV, you know it has failed.
You could also rewrite the functions yourself.
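Running a function in a new process, as suggested above, could be sketched like this, assuming POSIX fork() and waitpid(). The names run_isolated, well_behaved and buggy are hypothetical, and buggy only stands in for the faulty function.

```c
#define _POSIX_C_SOURCE 200809L  /* expose fork() and waitpid() */
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs fn in a child process.
 * Returns 0 if the child exited cleanly, the signal number if a signal
 * killed it (for example SIGSEGV), and -1 on any other failure. */
int run_isolated(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {            /* child: run the risky code, then leave */
        fn();
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}

/* Stand-ins for the functions under suspicion. */
void well_behaved(void) { }
void buggy(void)
{
    int *volatile p = NULL;
    *p = 123;                  /* deliberate invalid write */
}
```

The parent's memory is untouched by whatever the child corrupts, so any result has to be passed back explicitly, for example through a pipe or the exit status.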
I can see a case for recovering from a segmentation violation: if you're handling events in a loop and one of those events causes a segmentation violation, you would want to skip only that event and continue processing the remaining ones. In my eyes, segmentation violations are much the same as NullPointerExceptions in Java. Yes, the state will be inconsistent and unknown after either of these; however, in some cases you would like to handle the situation and carry on. For instance, in algo trading you would pause the execution of an order and allow a trader to take over manually, without crashing the entire system and ruining all the other orders.
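A rough sketch of such an event loop, assuming POSIX sigsetjmp()/siglongjmp() and events modeled as plain function pointers. The names event_fn, process_events, good_event, bad_event and processed are all invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and sigsetjmp() */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

typedef void (*event_fn)(void);  /* hypothetical event callback type */

static sigjmp_buf event_recovery;
int processed = 0;               /* bumped by events that complete */

static void on_event_segv(int signo)
{
    (void)signo;
    siglongjmp(event_recovery, 1);
}

/* Runs every event in order. Returns how many events were abandoned
 * because they raised SIGSEGV, or -1 if no handler could be installed. */
int process_events(const event_fn *events, size_t count)
{
    struct sigaction sa, old;
    volatile int skipped = 0;    /* volatile: must survive siglongjmp */
    volatile size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_event_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    for (i = 0; i < count; i++) {
        if (sigsetjmp(event_recovery, 1) == 0)
            events[i]();         /* may fault */
        else
            skipped++;           /* this event faulted: move on */
    }

    sigaction(SIGSEGV, &old, NULL);
    return skipped;
}

/* Sample events for the sketch. */
void good_event(void) { processed++; }
void bad_event(void)
{
    int *volatile p = NULL;
    *p = 1;                      /* deliberate invalid write */
}
```

Whatever the abandoned event had half-finished (locks held, memory allocated, partial updates) stays that way, which is the inconsistent state mentioned above.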
The best solution is to wrap each unsafe access this way:
Your program will then never crash on almost any OS.
This glibc manual gives you a clear picture of how to write signal handlers.
In your case you will have to handle SIGSEGV, which indicates a segmentation fault. The list of other signals can be found here.
Signal handlers are broadly classified into two categories:
1. Handlers that note the signal's arrival by tweaking some global data structures, and then return normally.
2. Handlers that terminate the program or transfer control to a point where it can recover from the situation that caused the signal.
SIGSEGV comes under program error signals.