Does this multiple-pipes code in C make sense?
I created a question about this a few days ago. My solution is something along the lines of what was suggested in the accepted answer. However, a friend of mine came up with the following solution:
Please note that the code has been updated a few times (check the edit revisions) to reflect the suggestions in the answers below. If you intend to give a new answer, please do so with this new code in mind and not the old one, which had lots of problems.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int fd[2], i, aux, std0, std1;

    do {
        std0 = dup(0); // backup stdin
        std1 = dup(1); // backup stdout

        // let's pretend I'm reading commands here in a shell prompt
        READ_COMMAND_FROM_PROMPT();

        for (i = 1; i < argc; i++) {
            // do we have a previous command?
            if (i > 1) {
                dup2(aux, 0);
                close(aux);
            }

            // do we have a next command?
            if (i < argc - 1) {
                pipe(fd);
                aux = fd[0];
                dup2(fd[1], 1);
                close(fd[1]);
            }

            // last command? restore stdout...
            if (i == argc - 1) {
                dup2(std1, 1);
                close(std1);
            }

            if (!fork()) {
                // if not last command, close all pipe ends
                // (the child doesn't use them)
                if (i < argc - 1) {
                    close(std0);
                    close(std1);
                    close(fd[0]);
                }

                execlp(argv[i], argv[i], NULL);
                exit(0);
            }
        }

        // restore stdin to be able to keep using the shell
        dup2(std0, 0);
        close(std0);
    } while (1); // the loop condition is missing in the original; assumed to run until the shell exits

    return 0;
}
This simulates a series of commands through pipes like in bash, for instance: cmd1 | cmd2 | ... | cmd_n. I say "simulate" because, as you can see, the commands are actually read from the arguments, just to save the time of coding a simple shell prompt...
Of course there are some issues to fix and things to add, like error handling, but that's not the point here. I think I kind of get the code, but I still find it confusing how this whole thing works.
Am I missing something, or does this really work, and is it a nice and clean solution to the problem? If not, can anyone point out the crucial problems this code has?
Comments (5)
Looks reasonable, though it really needs to fix leaking std and aux to the children and after the loop, and the parent's original stdin is lost forever.
This would probably be better with color...
foo gets stdin=stdin, stdout=pipe1[1]
bar gets stdin=pipe1[0], stdout=pipe2[1]
baz gets stdin=pipe2[0], stdout=stdout
My suggestion is different in that it avoids mangling the parent's stdin and stdout, only manipulating them within the child, and never leaks any FDs. It's a bit harder to diagram, though.
Edit
Your updated code does fix the previous FD leaks... but adds one: you're now leaking std0 to the children. As Jon says, this is probably not dangerous to most programs... but you should still write a better-behaved shell than this.
Even if it's temporary, I would strongly recommend against mangling your own shell's standard in/out/err (0/1/2); do so only within the child, right before exec. Why? Suppose you add some printf debugging in the middle, or you need to bail out due to an error condition. You'll be in trouble if you don't clean up your messed-up standard file descriptors first. Please, for the sake of having things operate as expected even in unexpected scenarios, don't muck with them until you need to.
Edit
As I mentioned in other comments, splitting it up into smaller parts makes it much easier to understand. This small helper should be easily understandable and bug-free:
[code block missing from this copy]
As should this:
[code block missing from this copy]
You can see Bash's execute_cmd.c#execute_disk_command being called from execute_cmd.c#execute_pipeline, xsh's process.c#process_run being called from jobs.c#job_run, and even every single one of BusyBox's various small and minimal shells splits them up.
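The first answer's advice (leave the parent's descriptors 0/1/2 alone and rewire them only in the child, right before exec) could look roughly like the sketch below. This is not the answerer's original code, which is missing from this copy; the helper names spawn and run_pipeline, their parameters, and the 127 exit code on exec failure are assumptions made for illustration, and error handling is kept minimal.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that runs argv with `in` as its stdin
 * and `out` as its stdout. `unused` is a descriptor the child must not
 * keep open (-1 if none). The parent's own 0/1/2 are never modified. */
static pid_t spawn(char *const argv[], int in, int out, int unused)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                 /* parent, or -1 if fork failed */

    if (unused >= 0)
        close(unused);              /* read end of the pipe we write to */
    if (in != STDIN_FILENO) {
        dup2(in, STDIN_FILENO);
        close(in);
    }
    if (out != STDOUT_FILENO) {
        dup2(out, STDOUT_FILENO);
        close(out);
    }
    execvp(argv[0], argv);
    _exit(127);                     /* reached only if exec failed */
}

/* Hypothetical driver: run cmds[0] | cmds[1] | ... | cmds[n-1] and return
 * the exit status of the last command, or -1 on failure. */
static int run_pipeline(char *const *cmds[], int n)
{
    assert(n >= 1);
    int in = STDIN_FILENO;
    pid_t last = -1;

    for (int i = 0; i < n; i++) {
        int fd[2] = { -1, STDOUT_FILENO };
        if (i < n - 1 && pipe(fd) < 0)
            return -1;
        last = spawn(cmds[i], in, fd[1], fd[0]);
        /* The parent drops its copies as soon as the child has them. */
        if (in != STDIN_FILENO)
            close(in);
        if (fd[1] != STDOUT_FILENO)
            close(fd[1]);
        in = fd[0];                 /* next command reads from this pipe */
    }

    int status = 0, result = -1;
    pid_t pid;
    while ((pid = wait(&status)) > 0)
        if (pid == last && WIFEXITED(status))
            result = WEXITSTATUS(status);
    return result;
}
```

Because the parent never calls dup2() on its own standard descriptors, a printf added for debugging in the loop still goes to the terminal, and bailing out early needs no restore step.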
The key problem is that you create a bunch of pipes and don't make sure that all the ends are closed properly. If you create a pipe, you get two file descriptors; if you fork, then you have four file descriptors. If you dup() or dup2() one end of the pipe to a standard descriptor, you need to close both ends of the pipe; at least one of the closes must be after the dup() or dup2() operation.
Consider the file descriptors available to the first command (assuming there are at least two commands; the single-command case, which needs no pipe() or I/O redirection, should be handled in general, but I recognize that the error handling is eliminated to keep the code suitable for SO):
[annotated code missing from this copy]
Note that because fd[0] is not closed in the child, the child will never get EOF on its standard input; this is usually problematic. The non-closure of std is less critical.
Revisiting amended code (as of 2009-06-03T20:52-07:00)...
Assume that the process starts with only file descriptors 0, 1, 2 (standard input, output, error) open. Also assume we have exactly 3 commands to process. As before, this code writes out the loop with annotations.
[annotated code missing from this copy]
So, all the children have the original standard input connected as file descriptor 3. This is not ideal, though it is not dreadfully traumatic; I'm hard pressed to find a circumstance where this would matter.
Closing file descriptor 4 in the parent is a mistake: the next iteration of 'read a command and process it' won't work because std1 is not initialized inside the loop.
Generally, this is close to correct, but not quite correct.
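The reason unclosed pipe ends matter can be shown in isolation: a reader gets EOF from a pipe only once every descriptor referring to the write end has been closed. The sketch below is an illustration, not code from the answer; the function name reader_sees_eof is hypothetical, and dup() stands in for a copy of the write end left open by fork() or dup2(). The read end is made non-blocking so the demonstration reports "would block" instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if read() on the pipe reports EOF, 0 if it would block
 * (some write end is still open), and -1 on any other error.
 * keep_copy != 0 leaves one duplicate of the write end open. */
static int reader_sees_eof(int keep_copy)
{
    assert(keep_copy == 0 || keep_copy == 1);

    int fd[2];
    if (pipe(fd) < 0)
        return -1;

    int copy = dup(fd[1]);          /* stands in for a leaked write end */
    close(fd[1]);                   /* the "obvious" write end is closed */
    if (!keep_copy)
        close(copy);

    /* Non-blocking, so a missing EOF shows up as EAGAIN, not a hang. */
    fcntl(fd[0], F_SETFL, O_NONBLOCK);

    char c;
    ssize_t n = read(fd[0], &c, 1);
    int result;
    if (n == 0)
        result = 1;                 /* EOF: no writers remain */
    else if (n < 0 && errno == EAGAIN)
        result = 0;                 /* a writer still exists */
    else
        result = -1;

    close(fd[0]);
    if (keep_copy)
        close(copy);
    return result;
}
```

In a real pipeline the read is blocking, so the second case appears as a command such as cat or sort that never terminates because some process still holds the write end of its input pipe.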
It will give results, some of them unexpected. It is far from a nice solution: it messes with the parent process's standard descriptors, does not recover the standard input, leaks descriptors to the children, etc.
If you think recursively, it may be easier to understand. Below is a correct solution, without error checking. Consider a linked-list type command, with its next pointer and an argv array.
[code block missing from this copy]
Call it with the first command in the linked list and input = -1. It does the rest.
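Since the recursive code itself is missing from this copy, here is a hedged reconstruction of the idea as described: a linked list of commands with next and argv, started with input = -1. The struct layout, the function name run_commands, and the choice to return the last child's pid are assumptions for illustration, not the answerer's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed layout, based on the description: a next pointer and an argv. */
struct command {
    struct command *next;
    char **argv;
};

/* Runs cmd and, recursively, everything after it. `input` is the read end
 * of the previous pipe, or -1 for the first command. Returns the pid of
 * the last command's child, or -1 on failure. */
static pid_t run_commands(const struct command *cmd, int input)
{
    assert(cmd != NULL && cmd->argv != NULL);

    int pfd[2] = { -1, -1 };
    if (cmd->next != NULL && pipe(pfd) < 0) {
        if (input != -1)
            close(input);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {                         /* child */
        if (input != -1) {
            dup2(input, STDIN_FILENO);
            close(input);
        }
        if (pfd[1] != -1) {
            dup2(pfd[1], STDOUT_FILENO);
            close(pfd[1]);
        }
        if (pfd[0] != -1)
            close(pfd[0]);                  /* the next command's input */
        execvp(cmd->argv[0], cmd->argv);
        _exit(127);                         /* exec failed */
    }

    /* parent: drop the ends this level no longer needs */
    if (input != -1)
        close(input);
    if (pfd[1] != -1)
        close(pfd[1]);
    if (pid < 0) {
        if (pfd[0] != -1)
            close(pfd[0]);
        return -1;
    }
    return cmd->next != NULL ? run_commands(cmd->next, pfd[0]) : pid;
}
```

The caller is expected to wait for the children afterwards, for example with waitpid() on the returned pid followed by a wait() loop to reap the rest.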
Both in this question and in another (linked in the first post), ephemient suggested a solution to the problem that doesn't mess with the parent's file descriptors, unlike the possible solution shown in this question.
I didn't get his solution. I tried and tried to understand it, but I can't seem to get it. I also tried to code it without understanding it, but it didn't work, probably because I failed to understand it correctly and wasn't able to code it the way it should have been coded.
Anyway, I tried to come up with my own solution using some of the things I understood from the pseudocode, and came up with this:
[code block missing from this copy]
This may not be the best and cleanest solution, but it was something I could come up with and, most importantly, something I can understand. What good is having something that works but that I don't understand, if I'm then evaluated by my teacher and can't explain to him what the code is doing?
Anyway, what do you think about this one?
This is my "final" code with ephemient's suggestions:
[code block missing from this copy]
Is it OK now?