Go exec.CommandContext is not being terminated after context timeout
In golang, I can usually use context.WithTimeout() in combination with exec.CommandContext() to get a command to automatically be killed (with SIGKILL) after the timeout.

But I'm running into a strange issue: if I wrap the command with sh -c AND buffer the command's outputs by setting cmd.Stdout = &bytes.Buffer{}, the timeout no longer works, and the command runs forever.

Why does this happen?

Here is a minimal reproducible example:

	package main

	import (
		"bytes"
		"context"
		"os/exec"
		"time"
	)

	func main() {
		ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
		defer cancel()

		cmdArgs := []string{"sh", "-c", "sleep infinity"}
		bufferOutputs := true

		// Uncommenting *either* of the next two lines will make the issue go away:
		// cmdArgs = []string{"sleep", "infinity"}
		// bufferOutputs = false

		cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
		if bufferOutputs {
			cmd.Stdout = &bytes.Buffer{}
		}
		_ = cmd.Run()
	}

I've tagged this question with Linux because I've only verified that this happens on Ubuntu 20.04 and I'm not sure whether it would reproduce on other platforms.

Comments (2)

My issue was that the child sleep process was not being killed when the context timed out. The sh parent process was being killed, but the child sleep was being left around.

This would normally still allow the cmd.Wait() call to succeed, but the problem is that cmd.Wait() waits both for the process to exit and for outputs to be copied. Because we've assigned cmd.Stdout to something that is not a file, os/exec creates a pipe and copies from it until it reaches EOF. The orphaned sleep process inherited the write end of that pipe as its stdout, so the pipe never closes while sleep is still running, and Wait never returns.

In order to kill child processes, we can instead start the process as its own process group leader by setting the Setpgid bit, which then allows us to signal the negative PID to kill the process as well as any subprocesses.

Here is a drop-in replacement for exec.CommandContext I came up with that does exactly this:

--- UPDATE ---

Since writing this code I've run into cases where subprocesses sometimes want to join their own process groups, and the setpgid trick no longer works because it will not kill processes in those new process groups. A more robust solution might be to manually traverse the process tree using something like go-ps, and for each descendant process, use the following pseudocode:

By setting cmd.WaitDelay, we can make sure Wait returns even if the I/O pipes are never closed: once the delay has elapsed after the context is done, the command is killed if it is still running and its pipes are closed. This was introduced in Go 1.20.