Actual meaning of 'shell=True' in subprocess
I am calling different processes with the subprocess module. However, I have a question.
In the following code:
callProcess = subprocess.Popen(['ls', '-l'], shell=True)
and
callProcess = subprocess.Popen(['ls', '-l']) # without shell
Both work. After reading the docs, I came to know that shell=True means executing the code through the shell. So that means that in its absence, the process is started directly.
So what should I prefer for my case? I need to run a process and get its output. What benefit do I have from calling it from within the shell or outside of it?
The benefit of not calling via the shell is that you are not invoking a 'mystery program.' On POSIX, the environment variable SHELL controls which binary is invoked as the "shell." On Windows, there is no Bourne shell descendant, only cmd.exe. So invoking the shell invokes a program of the user's choosing and is platform-dependent. Generally speaking, avoid invocations via the shell.
Invoking via the shell does allow you to expand environment variables and file globs according to the shell's usual mechanism. On POSIX systems, the shell expands file globs to a list of files. On Windows, a file glob (e.g., "*.*") is not expanded by the shell anyway (but environment variables on a command line are expanded by cmd.exe).
If you think you want environment variable expansions and file globs, research the ILS attacks of 1992-ish on network services which performed subprogram invocations via the shell. Examples include the various sendmail backdoors involving ILS.
In summary, use shell=False.
source: Subprocess Module
An example where things could go wrong with shell=True is shown here.
Check the doc here: subprocess.call()
Executing programs through the shell means that all user input passed to the program is interpreted according to the syntax and semantic rules of the invoked shell. At best, this only causes inconvenience to the user, because the user has to obey these rules. For instance, paths containing special shell characters like quotation marks or blanks must be escaped. At worst, it causes security leaks, because the user can execute arbitrary programs.
shell=True is sometimes convenient for making use of specific shell features like word splitting or parameter expansion. However, if such a feature is required, make use of the other modules available to you (e.g. os.path.expandvars() for parameter expansion or shlex for word splitting). This means more work, but avoids other problems.
In short: Avoid shell=True by all means.
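As a rough illustration of the two modules named in this answer, the sketch below splits a command string with shlex.split() and expands variables with os.path.expandvars(), without starting a shell. The helper name to_argv, the variable DEMO_DIR, and the sample path are made up for the example, and this does not emulate every shell quoting rule (for instance, a real shell would not expand variables inside single quotes).

```python
import os
import shlex

def to_argv(command):
    """Split a command string into words, then expand $VARS in each word."""
    # Splitting first means an expanded value containing spaces
    # still ends up as a single argument.
    return [os.path.expandvars(word) for word in shlex.split(command)]

# DEMO_DIR is a hypothetical variable, set here only for the demonstration.
os.environ["DEMO_DIR"] = "/tmp/my files"
print(to_argv('ls -l "$DEMO_DIR/report.txt"'))
```

The resulting list can be passed to subprocess functions as-is, with shell=False.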
The other answers here adequately explain the security caveats which are also mentioned in the subprocess documentation. But in addition to that, the overhead of starting a shell to start the program you want to run is often unnecessary and definitely silly for situations where you don't actually use any of the shell's functionality. Moreover, the additional hidden complexity should scare you, especially if you are not very familiar with the shell or the services it provides.
Where the interactions with the shell are nontrivial, you now require the reader and maintainer of the Python script (which may or may not be your future self) to understand both Python and shell script. Remember the Python motto "explicit is better than implicit"; even when the Python code is going to be somewhat more complex than the equivalent (and often very terse) shell script, you might be better off removing the shell and replacing the functionality with native Python constructs. Minimizing the work done in an external process and keeping control within your own code as far as possible is often a good idea simply because it improves visibility and reduces the risks of -- wanted or unwanted -- side effects.
Wildcard expansion, variable interpolation, and redirection are all simple to replace with native Python constructs. A complex shell pipeline where parts or all cannot be reasonably rewritten in Python would be the one situation where perhaps you could consider using the shell. You should still make sure you understand the performance and security implications.
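To make the previous paragraph concrete, here is a minimal sketch of how the three shell features could be replaced: glob.glob() for wildcards, os.environ for variables, and the stdout= argument for redirection. The function name list_matches_to_file and its parameters are invented for this example, and the child process is just the Python interpreter printing its arguments, standing in for whatever program you actually run.

```python
import glob
import os
import subprocess
import sys

def list_matches_to_file(pattern, log_path, directory=None):
    """Run a child on every file matching pattern, sending its output to a file."""
    if directory is None:
        # Variable interpolation: read the environment instead of writing "$HOME"
        directory = os.environ.get("HOME", os.curdir)
    # Wildcard expansion: glob.glob() does what the shell would do with *.txt
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    # Stand-in child program: prints each argument on its own line.
    printer = "import sys; print('\\n'.join(sys.argv[1:]))"
    # Redirection: hand an open file to stdout= instead of using '>'
    with open(log_path, "w") as log:
        subprocess.run([sys.executable, "-c", printer, *matches],
                       stdout=log, check=True)
    return matches
```

Because the file names travel as list elements, names containing spaces or shell metacharacters need no special treatment.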
In the trivial case, to avoid shell=True, simply replace the single shell command string with the equivalent list of arguments.
Notice how the first argument is then a list of strings to pass to execvp(), and how quoting strings and backslash-escaping shell metacharacters is generally not necessary (or useful, or correct).
Maybe see also: When to wrap quotes around a shell variable?
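The code samples for this step did not survive on this page, so the following is only a sketch of the idea, with made-up arguments. It uses the Python interpreter as a stand-in child that prints the arguments it receives, which shows that each list element arrives as exactly one argument, and that shlex.split() produces the same list from the shell-style spelling.

```python
import shlex
import subprocess
import sys

# Stand-in child program (an assumption for this demo): prints its arguments.
SHOW_ARGS = "import sys; print(sys.argv[1:])"

def show_args(*args):
    """Run the child without a shell and return what it saw as arguments."""
    result = subprocess.run([sys.executable, "-c", SHOW_ARGS, *args],
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout.strip()

# With a list, spaces inside an argument need neither quotes nor backslashes.
print(show_args("-with", "-options", "like this", "and an argument"))

# The shell-style spelling of the same arguments splits into the same list.
shell_style = "-with -options 'like this' and\\ an\\ argument"
print(shlex.split(shell_style))
```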
If you don't want to figure this out yourself, the shlex.split() function can do this for you. It's part of the Python standard library, but of course, if your shell command string is static, you can just run it once, during development, and paste the result into your script.
As an aside, you very often want to avoid Popen if one of the simpler wrappers in the subprocess package does what you want. If you have a recent enough Python, you should probably use subprocess.run.
With check=True it will fail if the command you ran failed.
With stdout=subprocess.PIPE it will capture the command's output.
With text=True (or somewhat obscurely, with the synonym universal_newlines=True) it will decode output into a proper Unicode string (it's just bytes in the system encoding otherwise, on Python 3).
If not, for many tasks, you want check_output to obtain the output from a command, whilst checking that it succeeded, or check_call if there is no output to collect.
I'll close with a quote from David Korn: "It's easier to write a portable shell than a portable shell script." Even subprocess.run('echo "$HOME"', shell=True) is not portable to Windows.
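For the original question (run a process and get its output), the subprocess.run options described in this answer combine roughly as follows. This is a sketch that assumes Python 3.7 or newer, where text=True is available; the helper name run_and_capture is invented here.

```python
import subprocess
import sys

def run_and_capture(argv):
    """Run a command without a shell and return its standard output as str."""
    result = subprocess.run(
        argv,
        check=True,              # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,  # capture standard output
        text=True,               # decode bytes into str
    )
    return result.stdout

# Example: the child is the Python interpreter itself, so this runs anywhere.
print(run_and_capture([sys.executable, "-c", "print('hello')"]), end="")
```

If the command exits with a non-zero status, the call raises subprocess.CalledProcessError instead of returning, so a failure cannot go unnoticed.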
The answers above explain it correctly, but not directly enough. Let's use the ps command to see what happens when a script starts sleep 100 with shell=True, prints the pid of the resulting process, and then prints finish.
You can run ps -auxf > 1 before finish, and then ps -auxf > 2 after finish.
See? In the first listing, instead of directly running sleep 100, it actually runs /bin/sh, and the pid it prints out is actually the pid of /bin/sh. If you then call s.kill(), it kills /bin/sh but, as the second listing shows, sleep is still there.
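The script and ps listings from this answer are missing on this page. As a small substitute, the sketch below asks the child itself which program is running: with shell=True on a POSIX system, "$0" is the name of the shell that Python started. This assumes a POSIX system, and the function name is made up. The exact process tree you see in ps can also vary, since some shells replace themselves with the command when it is the only one to run.

```python
import subprocess

def shell_started_by_python():
    """Return the name of the program that shell=True actually launches."""
    # "$0" is expanded by the shell itself, so the output names the shell.
    result = subprocess.run('echo "$0"', shell=True,
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout.strip()

print(shell_started_by_python())
```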
So the next question is: what can /bin/sh do? Every Linux user knows it, has heard of it, and uses it. But I bet there are many people who don't really understand what a shell actually is. Maybe you have also heard of /bin/bash; they're similar.
One obvious function of the shell is to make it convenient for users to run Linux applications. Because of shell programs like sh or bash, you can directly use a command like ls rather than /usr/bin/ls. The shell will search for where ls is and run it for you.
Another function is that it interprets the string after $ as an environment variable. You can compare two Python scripts, one using shell=True and one not, to find out for yourself.
And most importantly, it makes it possible to run Linux commands as a script. Constructs such as if and else are introduced by the shell; they are not native Linux commands.
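The two scripts to compare are not preserved here. A minimal sketch of such a comparison, assuming a POSIX system with an echo executable on PATH, might look like this. DEMO_VAR is a made-up variable name that is set only for the child process.

```python
import os
import subprocess

def echo_variable(use_shell):
    """Echo $DEMO_VAR either through the shell or directly."""
    env = dict(os.environ, DEMO_VAR="expanded")  # hypothetical variable
    # A string for the shell, a list of arguments when there is no shell.
    command = "echo $DEMO_VAR" if use_shell else ["echo", "$DEMO_VAR"]
    result = subprocess.run(command, shell=use_shell, env=env,
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout.strip()

print(echo_variable(use_shell=True))   # the shell substitutes the value
print(echo_variable(use_shell=False))  # echo receives the literal text $DEMO_VAR
```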
Let's assume you are using shell=False and providing the command as a list, and some malicious user tries injecting an 'rm' command.
You will see that 'rm' is interpreted as an argument, and effectively 'ls' will try to find a file called 'rm'.
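The snippet that belonged here is missing, so the following is a stand-in sketch rather than the original code. It uses a harmless echo in place of 'rm' and 'ls', and assumes a POSIX system with an echo executable on PATH; the function names and the sample input are invented.

```python
import subprocess

def echo_with_list(user_input):
    """shell=False: the whole input is one argument to echo."""
    result = subprocess.run(["echo", user_input],
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout

def echo_with_shell(user_input):
    """shell=True: deliberately unsafe, the shell parses the input."""
    result = subprocess.run("echo " + user_input, shell=True,
                            stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout

untrusted = "hello; echo INJECTED"
print(repr(echo_with_list(untrusted)))   # the ';' is printed as plain text
print(repr(echo_with_shell(untrusted)))  # the second command actually runs
```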
shell=False is not secure by default if you don't control the input properly. You can still execute dangerous commands.
I am writing most of my applications in container environments, I know which shell is being invoked, and I am not taking any user input.
So in my use case, I see no security risk, and it is much easier to create long strings of commands. Hope I am not wrong.