What can cause exec to fail? What happens next?

Posted on 2024-09-19 16:52:10

What are the reasons that an exec (execl, execlp, etc.) can fail? If you make a call to exec and it returns, are there any best practices other than just panicking and calling exit?

Comments (5)

时光磨忆 2024-09-26 16:52:11

The problem with handling exec failure is that exec is usually performed in a child process, while you want to do the error handling in the parent process. But you can't just exit(errno), because (1) you don't know whether the error code fits in an exit code, and (2) you can't distinguish a failure to exec from a failure exit code of the new program you exec'd.

The best solution I know is using pipes to communicate the success or failure of exec:

  1. Before forking, open a pipe in the parent process.
  2. After forking, the parent closes the writing end of the pipe and reads from the reading end.
  3. The child closes the reading end and sets the close-on-exec flag for the writing end.
  4. The child calls exec.
  5. If exec fails, the child writes the error code back to the parent using the pipe, then exits.
  6. The parent reads EOF (a zero-length read) if the child successfully performed exec, since the close-on-exec flag makes a successful exec close the writing end of the pipe. Or, if exec failed, the parent reads the error code and can proceed accordingly. Either way, the parent blocks until the child calls exec.
  7. The parent closes the reading end of the pipe.
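
The seven steps above can be sketched in C roughly as follows. This is a minimal sketch, not the answer author's code: the helper name `spawn_checked` is hypothetical, `execvp` is just one choice of exec variant, and the errno value is sent over the pipe as a raw `int`.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec argv[0], reporting an exec failure
 * to the parent through a close-on-exec pipe.
 * Returns the child's pid if exec succeeded; returns -1 with errno set
 * if pipe(), fork(), or the child's exec failed. */
pid_t spawn_checked(char *const argv[])
{
    int fds[2];
    if (pipe(fds) == -1)                 /* step 1: pipe before fork */
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        int saved = errno;
        close(fds[0]);
        close(fds[1]);
        errno = saved;
        return -1;
    }

    if (pid == 0) {
        /* Child (steps 3-5): keep only the write end, marked close-on-exec. */
        close(fds[0]);
        fcntl(fds[1], F_SETFD, FD_CLOEXEC);
        execvp(argv[0], argv);
        int err = errno;                 /* exec returned, so it failed */
        ssize_t ignored = write(fds[1], &err, sizeof err);
        (void)ignored;
        _exit(127);
    }

    /* Parent (steps 2, 6, 7): block until EOF or an error code arrives. */
    close(fds[1]);
    int err = 0;
    ssize_t n;
    do {
        n = read(fds[0], &err, sizeof err);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    if (n == (ssize_t)sizeof err) {      /* child reported an exec failure */
        waitpid(pid, NULL, 0);           /* reap the failed child */
        errno = err;
        return -1;
    }
    /* n == 0 means EOF: exec succeeded. A read error is also treated as
     * "no failure reported" in this sketch. */
    return pid;
}
```

With this shape, a caller can tell "the program could not be started" (return value -1, errno from exec) apart from "the program started and later exited with a failure status" (reported by a later waitpid on the returned pid).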

染年凉城似染瑾 2024-09-26 16:52:11

From the exec(3) man page:

The execl(), execle(), execlp(), execvp(), and execvP() functions may fail and set errno for any of the errors specified for the library functions execve(2) and malloc(3).

The execv() function may fail and set errno for any of the errors specified for the library function execve(2).

And then from the execve(2) man page:

ERRORS

Execve() will fail and return to the calling process if:

  • [E2BIG] - The number of bytes in the new process's argument list is larger than the system-imposed limit. This limit is specified by the sysctl(3) MIB variable KERN_ARGMAX.
  • [EACCES] - Search permission is denied for a component of the path prefix.
  • [EACCES] - The new process file is not an ordinary file.
  • [EACCES] - The new process file mode denies execute permission.
  • [EACCES] - The new process file is on a filesystem mounted with execution disabled (MNT_NOEXEC in <sys/mount.h>).
  • [EFAULT] - The new process file is not as long as indicated by the size values in its header.
  • [EFAULT] - Path, argv, or envp point to an illegal address.
  • [EIO] - An I/O error occurred while reading from the file system.
  • [ELOOP] - Too many symbolic links were encountered in translating the pathname. This is taken to be indicative of a looping symbolic link.
  • [ENAMETOOLONG] - A component of a pathname exceeded {NAME_MAX} characters, or an entire path name exceeded {PATH_MAX} characters.
  • [ENOENT] - The new process file does not exist.
  • [ENOEXEC] - The new process file has the appropriate access permission, but has an unrecognized format (e.g., an invalid magic number in its header).
  • [ENOMEM] - The new process requires more virtual memory than is allowed by the imposed maximum (getrlimit(2)).
  • [ENOTDIR] - A component of the path prefix is not a directory.
  • [ETXTBSY] - The new process file is a pure procedure (shared text) file that is currently open for writing or reading by some process.

malloc() is a lot less complicated, and uses only ENOMEM. From the malloc(3) man page:

If successful, calloc(), malloc(), realloc(), reallocf(), and valloc() functions return a pointer to allocated memory. If there is an error, they return a NULL pointer and set errno to ENOMEM.
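
As an illustration of how such a list might be used when reporting a failure, here is a small sketch that maps a few of the errno values quoted above to short hints. The function name `exec_error_hint` and the hint wording are assumptions made for this example; anything not listed falls back to `strerror()`.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn an errno value left by a failed exec into a
 * short hint. Only a handful of the documented errors are covered. */
const char *exec_error_hint(int err)
{
    switch (err) {
    case E2BIG:   return "argument list too long";
    case EACCES:  return "permission denied, not a regular file, or noexec mount";
    case ENOENT:  return "file does not exist";
    case ENOEXEC: return "unrecognized executable format";
    case ENOMEM:  return "not enough memory";
    case ENOTDIR: return "a path component is not a directory";
    case ETXTBSY: return "executable is busy (open in another process)";
    default:      return strerror(err);   /* generic system message */
    }
}
```

A caller would typically save errno immediately after exec returns and pass the saved value in, since later library calls may overwrite errno.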

就像说晚安 2024-09-26 16:52:11

What you do after the exec() call returns depends on the context - what the program is supposed to do, what the error is, and what you might be able to do to work around the problem.

One source of trouble could be that you specified a simple program name instead of a pathname; maybe you could retry with execvp(), or convert the command into an invocation of sh -c 'what you originally specified'. Whether any of these is reasonable depends on the application. If there are major security issues involved, probably you don't try again.
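
The execvp() retry mentioned above might look like the following sketch. The helper name `run_with_path_retry` is hypothetical, and the exit status 127 for "could not exec" is a convention assumed here, not something the answer prescribes. The retry only happens for a bare name (no '/') that failed with ENOENT.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv in a child. Try `name` as a pathname first;
 * if that fails with ENOENT and `name` contains no '/', retry with a PATH
 * search. Returns the child's exit status, or -1 if it is unavailable. */
int run_with_path_retry(const char *name, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execv(name, argv);                    /* returns only on failure */
        if (errno == ENOENT && strchr(name, '/') == NULL)
            execvp(name, argv);               /* fallback: search PATH */
        perror(name);                         /* both attempts failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Whether a PATH search is acceptable is exactly the application and security question raised above; a program running with elevated privileges would probably skip the fallback.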

If you specified a pathname and there is a problem with that (ENOTDIR, ENOENT, EPERM), then you may not have any sensible fallback, but you can report the error meaningfully.

In the old days (10+ years ago), some systems did not support the '#!' shebang notation, and if you were not sure whether you were executing an executable or a shell script, you tried it as an executable and then retried it as a shell script. That might or might not work if you were running a Perl script, but in those days, you wrote your Perl scripts to detect that they were being run by a shell and to re-exec themselves with Perl. Fortunately, those days are mostly over.

To the extent possible, it is important to ensure that the process reports the problem so that it can be traced - writing its message to a log file or just to stderr (or maybe even syslog()), so that those who have to work out what went wrong have more information to help them than the hapless end user's report "I tried X and it didn't work". It is crucial that if nothing works, then the exit status is not 0, as that indicates success. Even that might be ignored - but you did what you could.

翻了热茶 2024-09-26 16:52:11

Other than just panicking, you could make a decision based on errno's value.
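
One simple decision of this kind is choosing the exit status from errno. The sketch below assumes the convention used by POSIX shells (127 for "command not found", 126 for "found but could not be executed"); that convention and the helper names are assumptions added for illustration, not part of the answer.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed convention, borrowed from POSIX shells:
 * 127 = command not found, 126 = found but could not be executed. */
int exit_status_for_exec_errno(int err)
{
    return err == ENOENT ? 127 : 126;
}

/* Hypothetical helper: exec argv[0] in a child and return its exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execv(argv[0], argv);
        int err = errno;                  /* save before perror() can change it */
        perror(argv[0]);
        _exit(exit_status_for_exec_errno(err));
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Note that the parent still cannot tell these statuses apart from a program that itself exits with 126 or 127, which is the limitation the pipe-based approach in the first comment addresses.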

怎言笑 2024-09-26 16:52:11

Exec should always succeed (except for shells, e.g. if the user entered a bogus command).

If exec does fail, it indicates:

  • a "fault" with the program (missing or bad component, wrong pathname, bad memory, ...), or
  • a serious system error (out of memory, too many processes, disk fault, ...)

For any serious error, the normal approach is to write the error message on stderr, then exit with a failure code. Almost all of the standard tools do this. For exec:

execl("bork", "bork", (char *)NULL);  /* returns only if the exec failed */
perror("failed: exec");
exit(127);

The shell does that, too (more or less).

Normally if a child process fails, the parent has failed too and should exit. It does not matter whether the child failed in exec, or while running the program. If exec failed, it does not matter why exec failed. If the child process failed for any reason, the calling process is in trouble and needs to stop.

Don't waste lots of time trying to anticipate all possible error conditions. Don't write code that tries to handle each error code in the best possible way. You'll just bloat the code, and introduce many new bugs. If your program is broken, or it's being abused, it should simply fail. If you force it to continue, worse trouble will come of that.

For example, if the system is out of memory and thrashing swap, we don't want to cycle over and over trying to run a process; it would just make the situation worse. If we get a filesystem error, we don't want to continue running on that filesystem; it might make the corruption worse. If the program was installed wrongly, or has a bug, or has memory corruption, we want to stop as soon as possible, before that broken program does some real damage (such as sending a corrupted report to a client, trashing a database, ...).

One possible alternative: a failing process might call for help, pause itself (SIGSTOP), then retry the operation if told to continue. This could help when the system is out of memory, or disks are full, or perhaps even if there is a fault in the program. Few operations are so expensive and important that this would be worthwhile.
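
A rough sketch of that "stop and wait to be resumed" idea follows. Everything here is illustrative: `flaky_operation` is a made-up stand-in for the expensive operation, `retry_after_stop` and `demo_stop_and_retry` are hypothetical names, and in the demo the parent process plays the role of the operator who sends SIGCONT.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Made-up operation: fails on its first call, succeeds afterwards. */
static int flaky_operation(void)
{
    static int calls = 0;
    return ++calls > 1 ? 0 : -1;
}

/* Run op(); on failure, report on stderr, stop this process with SIGSTOP,
 * and retry once something sends SIGCONT. Gives up after max_tries. */
int retry_after_stop(int (*op)(void), int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (op() == 0)
            return 0;
        if (i + 1 == max_tries)
            break;                        /* no tries left: do not stop again */
        fprintf(stderr, "pid %ld: operation failed; send SIGCONT to retry\n",
                (long)getpid());
        raise(SIGSTOP);                   /* execution resumes here after SIGCONT */
    }
    return -1;
}

/* Demo: the child stops itself after a failure and the parent resumes it.
 * Returns the child's exit status, or -1 on error. */
int demo_stop_and_retry(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(retry_after_stop(flaky_operation, 3) == 0 ? 0 : 1);

    int status;
    if (waitpid(pid, &status, WUNTRACED) == -1 || !WIFSTOPPED(status))
        return -1;                        /* expected the child to stop itself */
    kill(pid, SIGCONT);                   /* "told to continue" */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In real use the SIGCONT would come from an administrator or a supervising process after the underlying problem (memory, disk space) has been fixed.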

If you're making an interactive GUI program, try to do it as a thin wrapper over reusable command-line tools (which exit if something goes wrong). Every function in your program should be accessible through the GUI, through the command-line, and as a function call. Write your functions. Write a few tools to make command-line and GUI wrappers for any function. Use sub-processes too.

If you are making a truly critical system, such as a controller for a nuclear power station, or a program to predict tsunamis, then what are you doing reading my dumb advice? Critical systems should not depend entirely on computers or software. There needs to be a 'manual override', with someone to drive it. Especially, do not attempt to build a critical system on MS Windows; that is like building sandcastles underwater.
