How does the #! shebang work?
In a script you must include a #! on the first line, followed by the path to the program that will execute the script (e.g. sh, perl).
As far as I know, the # character denotes the start of a comment, and that line is supposed to be ignored by the program executing the script. It would seem that this first line is at some point read by something in order for the script to be executed by the proper program.
Could somebody please shed more light on the workings of the #!?
I'm really curious about this, so the more in-depth the answer the better.
The Unix kernel's program loader is responsible for doing this. When exec() is called, it asks the kernel to load the program from the file named in its argument. The kernel then checks the first 16 bits (two bytes) of the file to see what executable format it has. If those bytes are #!, it uses the rest of the first line of the file to find which program it should launch, and it passes the name of the file it was asked to launch (the script) to that interpreter as an argument, after the interpreter's own name and any argument given on the #! line.
The interpreter then runs as normal, and treats the #! line as a comment.
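The parsing step described above can be sketched in a few lines of Python. This is only an illustration, not the kernel's code: the function names parse_shebang and build_argv are made up for this example, and the choice to treat everything after the interpreter path as a single argument is an assumption based on how Linux behaves (other systems split the line differently).

```python
from typing import List, Optional


def parse_shebang(first_bytes: bytes) -> Optional[List[str]]:
    """Return [interpreter] or [interpreter, argument]; None if not a script."""
    # The format check: are the first two bytes "#!"?
    if first_bytes[:2] != b"#!":
        return None
    # Only the first line matters; everything after the newline is script body.
    line = first_bytes[2:].split(b"\n", 1)[0].decode("utf-8", "replace")
    line = line.strip(" \t\r")
    if not line:
        return None
    # Assumption (Linux-like): interpreter path, then the rest as ONE argument.
    parts = line.split(None, 1)
    if len(parts) == 2:
        return [parts[0], parts[1].strip()]
    return [parts[0]]


def build_argv(script_path: str, script_args: List[str],
               first_bytes: bytes) -> Optional[List[str]]:
    """Build the argument list the interpreter would be started with."""
    shebang = parse_shebang(first_bytes)
    if shebang is None:
        return None
    # The script's own path follows the interpreter and its optional argument;
    # the arguments originally given to the script come last.
    return shebang + [script_path] + list(script_args)


if __name__ == "__main__":
    print(build_argv("./something", [], b"#!/usr/bin/env python\nprint(1)\n"))
```

Note that the interpreter receives the script's path, not its contents, which is why the interpreter itself later sees the #! line and has to ignore it.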
The Linux kernel exec system call uses the initial bytes #! to identify the file type
When you run ./something in Bash on Linux, this calls the exec system call with the path ./something.
This line in the kernel is then run on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25
It reads the very first bytes of the file, and compares them to #!.
If the comparison is true, then the rest of the line is parsed by the Linux kernel, which makes another exec call. For a first line of #!/usr/bin/env python, that call uses the path /usr/bin/env, with python as the first argument and the path of the current file as the argument after it. This works for any scripting language that uses # as a comment character.
And yes, you can make an infinite loop with an executable file whose #! line names the file itself as its interpreter. Bash recognizes the error and reports a bad interpreter.
#! is human readable, but that is not required.
If the file started with different bytes, then the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for the bytes 7f 45 4c 46 (which also happen to be human readable, as .ELF). You can confirm this by dumping the first 4 bytes of /bin/ls, which is an ELF executable, in hex: they are 7f 45 4c 46.
So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts running it. See also: How does kernel get an executable binary file running under linux?
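To make the idea of choosing a handler by the leading bytes concrete, here is a small Python sketch. It is not how the kernel is implemented; classify_magic, classify_file and hex_bytes are names invented for this example, and only the two formats discussed above are recognized.

```python
ELF_MAGIC = b"\x7fELF"   # the bytes 7f 45 4c 46
SCRIPT_MAGIC = b"#!"     # the bytes 23 21


def classify_magic(head: bytes) -> str:
    """Guess the executable format from the first few bytes of a file."""
    if head.startswith(SCRIPT_MAGIC):
        return "script"
    if head.startswith(ELF_MAGIC):
        return "elf"
    return "unknown"


def classify_file(path: str) -> str:
    """Read only the first 4 bytes of the file and classify them."""
    with open(path, "rb") as handle:
        return classify_magic(handle.read(4))


def hex_bytes(head: bytes) -> str:
    """Format bytes the way a hex dump would show them."""
    return " ".join(f"{byte:02x}" for byte in head)


if __name__ == "__main__":
    # /bin/ls is assumed to exist and to be an ELF file, as on typical Linux systems.
    with open("/bin/ls", "rb") as handle:
        head = handle.read(4)
    print(hex_bytes(head), classify_magic(head))
```

Calling classify_file on a shell script and on a compiled binary should give "script" and "elf" respectively; anything else falls through to "unknown", which is where further handlers would be tried.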
Finally, you can add your own shebang handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.
I don't think POSIX specifies shebangs, however: https://unix.stackexchange.com/a/346214/32558 . It does mention them in its rationale sections, in the form "if executable scripts are supported by the system, something may happen". macOS and FreeBSD also seem to implement them.
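For the binfmt_misc mechanism mentioned above, a handler is registered by writing a colon-separated rule of the form :name:type:offset:magic:mask:interpreter:flags to the register file under /proc/sys/fs/binfmt_misc. The Python sketch below only builds such a rule string; it does not register anything. The helper name binfmt_misc_rule and the wrapper path /usr/local/bin/jarwrapper are assumptions for illustration, and a real wrapper script would have to exist at that path.

```python
def binfmt_misc_rule(name: str, kind: str, interpreter: str,
                     magic: str = "", offset: str = "",
                     mask: str = "", flags: str = "") -> str:
    """Build a rule string of the form :name:type:offset:magic:mask:interpreter:flags."""
    # "M" matches on magic bytes, "E" matches on the file extension
    # (for "E", the magic field holds the extension without the dot).
    if kind not in ("M", "E"):
        raise ValueError("kind must be 'M' (magic) or 'E' (extension)")
    # The leading empty string produces the initial ':' separator.
    return ":".join(["", name, kind, offset, magic, mask, interpreter, flags])


if __name__ == "__main__":
    # Hypothetical handler: run any *.jar file through a wrapper script.
    print(binfmt_misc_rule("jar", "E", "/usr/local/bin/jarwrapper", magic="jar"))
```

Writing the resulting string to the register file requires root privileges and a kernel with binfmt_misc enabled, so the sketch stops at producing the text.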
Short story: The shebang (#!) line is read by the operating system's program loader. While it formally looks like a comment, the fact that #! makes up the very first two bytes of the file marks the whole file as a script. The script will be passed to the executable mentioned on the first line after the shebang. Voilà!
Slightly longer story: Imagine you have your script, foo.sh, with the executable bit (x) set. Its first line is the shebang #!/bin/sh, followed by the script's commands. Now, at your shell prompt, you type the script's path, for example ./foo.sh.
Remember that you type this inside the shell process (let's assume this is the program /bin/sh). Therefore, that input will have to be processed by that program. It interprets this line as a command, since the very first thing entered on the line is the name of a file that actually exists and has the executable bit(s) set. The shell then asks the operating system to execute that file.
The program loader starts reading the file's contents and discovers the shebang (#!) right at the very beginning of the file. To the loader, this is a token ("magic number") by which it knows that the file contains a script.
Now, how does it know which programming language the script is written in? After all, you can execute Bash scripts, Perl scripts, Python scripts, ... All the loader knows so far is that it is looking at a script file (which is not a binary file, but a text file). Thus it reads on up to the first line break (which results in /bin/sh, compare with the above). This is the interpreter to which the script will be passed for execution. (In this particular case the target interpreter is the same program as the shell you typed into, but a separate /bin/sh process still runs the script.)
If the script was destined for e.g. /bin/perl, all that the Perl interpreter would (optionally) have to do is check whether the shebang line really mentions the Perl interpreter. If not, the Perl interpreter would know that the script is not meant for it. If the Perl interpreter is indeed mentioned in the shebang line, it reads the rest of the script file and executes it.
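The last step, where the interpreter reads the whole file including the shebang line, can be illustrated with Python standing in for Perl. This is a sketch, not a description of any interpreter's internals: first_token and run_script_source are names made up for this example, and the script text is invented. It shows that the interpreter's own tokenizer sees the #! line as an ordinary comment and then runs the rest.

```python
import io
import tokenize

# A made-up script: a shebang line followed by one statement.
SCRIPT = "#!/usr/bin/env python3\nanswer = 6 * 7\n"


def first_token(source: str):
    """Return (token type name, token text) of the first token in the source."""
    token = next(tokenize.generate_tokens(io.StringIO(source).readline))
    return tokenize.tok_name[token.type], token.string


def run_script_source(source: str) -> dict:
    """Compile and run the source, returning the resulting global namespace."""
    namespace: dict = {}
    exec(compile(source, "<script>", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    print(first_token(SCRIPT))                  # the shebang line, seen as a comment
    print(run_script_source(SCRIPT)["answer"])  # the rest of the script still runs
```

The same reasoning is why the trick only works for languages that use # as their comment character: the program loader consumes the line for its own purposes, and the interpreter must be able to skip it.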