Why does subprocess.Popen() with shell=True work differently on Linux vs Windows?

Posted 2024-07-30 09:07:30

When using subprocess.Popen(args, shell=True) to run "gcc --version" (just as an example), on Windows we get this:

>>> from subprocess import Popen
>>> Popen(['gcc', '--version'], shell=True)
gcc (GCC) 3.4.5 (mingw-vista special r3) ...

So it's nicely printing out the version as I expect. But on Linux we get this:

>>> from subprocess import Popen
>>> Popen(['gcc', '--version'], shell=True)
gcc: no input files

Because gcc hasn't received the --version option.

The docs don't specify exactly what should happen to the args under Windows, but they do say that on Unix, "If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments." IMHO the Windows way is better, because it lets you treat Popen(arglist) calls the same as Popen(arglist, shell=True) ones.

Why the difference between Windows and Linux here?

Comments (3)

橪书 2024-08-06 09:07:31

Actually on Windows, it does use cmd.exe when shell=True - it prepends cmd.exe /c (it actually looks up the COMSPEC environment variable but defaults to cmd.exe if not present) to the shell arguments. (On Windows 95/98 it uses the intermediate w9xpopen program to actually launch the command).

So the strange implementation is actually the UNIX one, which does the following (where each space separates a different argument):

/bin/sh -c gcc --version

It looks like the correct implementation (at least on Linux) would be:

/bin/sh -c "gcc --version" gcc --version

This would take the command string from the quoted parameter and still pass the other parameters through to the shell as $0, $1, and so on.

From the sh man page section for -c:

Read commands from the command_string operand instead of from the standard input. Special parameter 0 will be set from the command_name operand and the positional parameters ($1, $2, etc.) set from the remaining argument operands.
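To see the -c rule in action from Python, here is a minimal sketch. It assumes a POSIX system where shell=True means /bin/sh -c; the function names are made up for illustration and echo stands in for gcc.

```python
import subprocess


def show_shell_positionals():
    # The first list item is the command string; the remaining items
    # become $0 and $1 of the shell itself, not arguments of echo.
    result = subprocess.run(
        ['echo "$0 / $1"', "first", "second"],
        shell=True, capture_output=True, text=True,
    )
    return result.stdout


def show_dropped_argument():
    # Equivalent to: /bin/sh -c echo dropped
    # echo runs with no arguments, just like gcc in the question.
    result = subprocess.run(
        ["echo", "dropped"],
        shell=True, capture_output=True, text=True,
    )
    return result.stdout


print(repr(show_shell_positionals()))  # 'first / second\n'
print(repr(show_dropped_argument()))   # '\n'
```

So the extra list items are not lost, they just end up as the shell's own positional parameters rather than as arguments of the command.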

This patch seems to fairly simply do the trick:

--- subprocess.py.orig  2009-04-19 04:43:42.000000000 +0200
+++ subprocess.py       2009-08-10 13:08:48.000000000 +0200
@@ -990,7 +990,7 @@
                 args = list(args)

             if shell:
-                args = ["/bin/sh", "-c"] + args
+                args = ["/bin/sh", "-c"] + [" ".join(args)] + args

             if executable is None:
                 executable = args[0]
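One caveat with the patch: " ".join(args) does no quoting, so an item containing spaces or shell metacharacters would be split or interpreted again by the shell. A possible workaround without touching subprocess.py is sketched below. The helper name is hypothetical, and it assumes a POSIX shell and Python 3.8+ (for shlex.join).

```python
import shlex
import subprocess


def run_list_through_shell(args):
    """Hypothetical helper: quote each item, join them, and pass one string to the shell."""
    command = shlex.join(args)  # ['echo', 'My File'] -> "echo 'My File'"
    return subprocess.run(command, shell=True, capture_output=True, text=True)


result = run_list_through_shell(["echo", "My File"])
print(result.stdout, end="")  # My File
```

With this, Popen-style argument lists behave the same with or without the shell, at least on POSIX; shlex.join is not meant for cmd.exe quoting.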

小女人ら 2024-08-06 09:07:31

From the subprocess.py source:

On UNIX, with shell=True: If args is a string, it specifies the
command string to execute through the shell. If args is a sequence,
the first item specifies the command string, and any additional items
will be treated as additional shell arguments.

On Windows: the Popen class uses CreateProcess() to execute the child
program, which operates on strings. If args is a sequence, it will be
converted to a string using the list2cmdline method. Please note that
not all MS Windows applications interpret the command line the same
way: The list2cmdline is designed for applications using the same
rules as the MS C runtime.

That doesn't answer why, just clarifies that you are seeing the expected behavior.

The "why" is probably that on UNIX-like systems, command arguments are actually passed through to applications (using the exec* family of calls) as an array of strings. In other words, the calling process decides what goes into EACH command line argument. Whereas when you tell it to use a shell, the calling process actually only gets the chance to pass a single command line argument to the shell to execute: The entire command line that you want executed, executable name and arguments, as a single string.

But on Windows, the entire command line (according to the above documentation) is passed as a single string to the child process. If you look at the CreateProcess API documentation, you will notice that it expects all of the command line arguments to be concatenated together into a big string (hence the call to list2cmdline).
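The list-to-string step can be inspected directly. As far as I know, subprocess.list2cmdline is a plain module-level function that is importable on any platform, although it is only used internally on Windows, so this sketch should run on Linux too:

```python
import subprocess

# A sequence is flattened into one command-line string,
# following the MS C runtime quoting rules.
print(subprocess.list2cmdline(["gcc", "--version"]))
# gcc --version

# Items containing spaces are wrapped in double quotes.
print(subprocess.list2cmdline(["cp", "My File", "New Location"]))
# cp "My File" "New Location"
```

That single string is what ends up being passed to CreateProcess(), which is why every list item survives on Windows.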

Plus there is the fact that on UNIX-like systems the shell receives its command as a separate argument from everything else, so I suspect that the other reason for the difference is that on Windows, shell=True changes comparatively little: the command line is still built as a single string with list2cmdline and merely gets handed to the shell named by COMSPEC (cmd.exe /c by default), which is why it is working the way you are seeing. The only way to make the two systems act identically would be for it to simply drop all of the command line arguments when shell=True on Windows.

耀眼的星火 2024-08-06 09:07:31

The reason for the UNIX behaviour of shell=True is to do with quoting. When we write a shell command, it will be split at spaces, so we have to quote some arguments:

cp "My File" "New Location"

This leads to problems when our arguments contain quotes, which requires escaping:

grep -r "\"hello\"" .

Sometimes we can get awful situations where \ must be escaped too!

Of course, the real problem is that we're trying to use one string to specify multiple strings. When calling system commands, most programming languages avoid this by allowing us to send multiple strings in the first place, hence:

Popen(['cp', 'My File', 'New Location'])
Popen(['grep', '-r', '"hello"', '.'])
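A quick way to convince yourself that list items arrive untouched is to have the child process print its own argument vector. This is only a sketch: the child here is the current Python interpreter rather than cp or grep, and the helper name is made up.

```python
import subprocess
import sys


def child_argv(*items):
    # No shell involved: each item is handed to the child as one argument.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1:])", *items],
        capture_output=True, text=True,
    )
    return result.stdout


print(child_argv("My File", '"hello"'), end="")
# ['My File', '"hello"']
```

The space in the first item and the quotes in the second come through as-is, with no escaping needed.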

Sometimes it can be nice to run "raw" shell commands; for example, if we're copy-pasting something from a shell script or a Web site, and we don't want to convert all of the horrible escaping manually. That's why the shell=True option exists:

Popen('cp "My File" "New Location"', shell=True)
Popen(r'grep -r "\"hello\"" .', shell=True)
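One pitfall when pasting shell commands into Python source: in an ordinary string literal Python processes the backslashes before the shell ever sees them, so a raw string is the safer choice. The sketch below uses shlex.split as a stand-in for POSIX shell word splitting (it does not run anything), just to show what the shell would receive in each case:

```python
import shlex

plain = 'grep -r "\"hello\"" .'   # Python turns \" into " here
raw = r'grep -r "\"hello\"" .'    # backslashes are kept for the shell

print(plain)               # grep -r ""hello"" .
print(shlex.split(plain))  # ['grep', '-r', 'hello', '.']
print(shlex.split(raw))    # ['grep', '-r', '"hello"', '.']
```

Only the raw version searches for the quoted text "hello"; the plain one silently searches for hello without the quotes.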

I'm not familiar with Windows so I don't know how or why it behaves differently.
