
How does the #! shebang work?

Posted 2024-09-05 17:59:27 · 216 characters · 13 views · 0 comments

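In a script, the first line must contain #! followed by the path to the program that will execute the script (e.g. sh, perl).

As far as I know, the # character marks the start of a comment, and the program executing the script is supposed to ignore that line. It seems, though, that this first line is read by something at some point, so that the script is executed by the right program.

Could somebody shed more light on how #! works?

I'm really curious about this, so the more in-depth the answer, the better.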

一个人的夜不怕黑 2024-09-12 17:59:27

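The Unix kernel's program loader is responsible for this. When exec() is called, it asks the kernel to load the program from the file named in its argument. The kernel then checks the first 16 bits of the file to see which executable format it has. If it finds that these bits are #!, it uses the rest of the first line to work out which program it should launch, and it passes the name of the file it was trying to launch (the script) as the last argument to that interpreter.

The interpreter then runs as normal and treats the #! line as a comment.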
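A minimal sketch of that hand-off, assuming a Unix-like system where /bin/echo exists and the temporary directory allows execution. The file name demo_script and the word marker are made up for illustration. The script names /bin/echo as its interpreter, so running it shows which arguments the program loader passes along:

```python
import os
import stat
import subprocess
import tempfile


def run_shebang_demo() -> str:
    """Create a script whose interpreter is /bin/echo and execute it directly."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "demo_script")  # hypothetical file name
        with open(script, "w") as f:
            # /bin/echo is assumed to exist; "marker" is an arbitrary argument.
            f.write("#!/bin/echo marker\n")
        os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)
        # No shell is involved: subprocess hands the path straight to exec,
        # and the program loader resolves the #! line.
        result = subprocess.run(
            [script], capture_output=True, text=True, check=True
        )
        # Hide the random temporary path so the output is stable.
        return result.stdout.replace(script, "<script>")


if __name__ == "__main__":
    print(run_shebang_demo(), end="")
```

Because /bin/echo only prints its arguments, the output is the word marker followed by the script path, which is what a real interpreter would receive as its arguments.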

痴情 2024-09-12 17:59:27

The Linux kernel's exec system call uses the initial bytes #! to identify the file type

When you run this in Bash:

./something

on Linux, this calls the exec system call with the path ./something.

The kernel then runs this check on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25

if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))

It reads the first two bytes of the file and compares them to #!. When this condition is true, the bytes do not match, and the script handler rejects the file so that another handler can try it.

If the bytes do match, the Linux kernel parses the rest of the line and makes another exec call on the interpreter named there. For example, with the shebang line #!/usr/bin/env python, it executes /usr/bin/env with python as an argument and the path of the script appended as the last argument:

/usr/bin/env python /path/to/script.py

This works for any scripting language that uses # as a comment character.
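The parsing step can be imitated in user space. The sketch below is only an approximation, assuming Linux's behaviour of keeping everything after the interpreter path as a single argument (other systems split the line differently). The function name build_interpreter_argv is hypothetical and is not part of the kernel.

```python
from typing import List, Optional


def build_interpreter_argv(first_line: bytes, script_path: str) -> Optional[List[str]]:
    """Return the argv the interpreter would get, or None if there is no shebang."""
    if first_line[:2] != b"#!":
        return None  # not a script; another handler would have to take it
    # Keep only the first line and trim surrounding spaces and tabs.
    rest = first_line[2:].split(b"\n", 1)[0].decode().strip(" \t")
    if not rest:
        return None  # "#!" with nothing after it names no interpreter
    # Split off the interpreter path; whatever remains stays ONE argument
    # (assumed Linux behaviour, simplified).
    parts = rest.split(None, 1)
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1].strip(" \t"))
    argv.append(script_path)  # the script itself always comes last
    return argv


if __name__ == "__main__":
    print(build_interpreter_argv(b"#!/usr/bin/env python\n", "/path/to/script.py"))
```

Under this model, a line such as #!/usr/bin/env python -u would hand env the single argument "python -u", which is why multi-word shebang lines often behave unexpectedly on Linux.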

And yes, you can try to make an infinite loop with:

printf '#!/a\n' | sudo tee /a
sudo chmod +x /a
/a

The kernel gives up after a few levels of interpreter recursion, and Bash reports the error:

-bash: /a: /a: bad interpreter: Too many levels of symbolic links

The #! magic is human readable, but it does not have to be.

If the file started with different bytes, the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for the bytes 7f 45 4c 46 (these also happen to be partly human readable: the last three spell ELF). Let's confirm that by reading the first 4 bytes of /bin/ls, which is an ELF executable:

head -c 4 "$(which ls)" | hd

output:

00000000  7f 45 4c 46                                       |.ELF|
00000004

So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?
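The dispatch on leading bytes can be sketched as follows. This is only an illustration of the idea, not the kernel's handler list, and the names classify_magic and classify_file are hypothetical.

```python
def classify_magic(head: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if head[:4] == b"\x7fELF":  # bytes 7f 45 4c 46
        return "elf"
    if head[:2] == b"#!":  # bytes 23 21
        return "script"
    return "unknown"


def classify_file(path: str) -> str:
    """Read just enough of the file to run the magic-number check."""
    with open(path, "rb") as f:
        return classify_magic(f.read(4))


if __name__ == "__main__":
    import sys

    # Classify the running Python interpreter binary as a quick check.
    print(classify_file(sys.executable))
```

On a typical Linux system, classify_file("/bin/ls") should report an ELF file, matching the hex dump above.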

Finally, you can add your own shebang handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.
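As a rough illustration, a binfmt_misc rule is a single colon-separated string that root writes to /proc/sys/fs/binfmt_misc/register. The sketch below only builds such a string and does not register anything. The helper name binfmt_misc_rule and the wrapper path /usr/local/bin/jarwrapper are assumptions made for the example.

```python
def binfmt_misc_rule(
    name: str,
    kind: str,
    magic: str,
    interpreter: str,
    offset: str = "",
    mask: str = "",
    flags: str = "",
) -> str:
    """Build a rule in the form :name:type:offset:magic:mask:interpreter:flags"""
    fields = [name, kind, offset, magic, mask, interpreter, flags]
    return ":" + ":".join(fields)


if __name__ == "__main__":
    # Type "E" matches on the file extension, type "M" on magic bytes.
    # /usr/local/bin/jarwrapper is a hypothetical wrapper script.
    print(binfmt_misc_rule("jar", "E", "jar", "/usr/local/bin/jarwrapper"))
```

Registering the rule would require root privileges and a mounted binfmt_misc filesystem, so the sketch stops at producing the text.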

I don't think POSIX specifies shebangs, however: https://unix.stackexchange.com/a/346214/32558 . It does mention them in its rationale sections, in the form "if executable scripts are supported by the system, something may happen". macOS and FreeBSD also seem to implement them.

意中人 2024-09-12 17:59:27

Short story: The shebang (#!) line is read by ~~the shell (e.g. sh, bash, etc.)~~ the operating system's program loader. While it formally looks like a comment, the fact that it's the very first two bytes of a file marks the whole file as a text file and as a script. The script will be passed to the executable mentioned on the first line after the shebang. Voilà!

Slightly longer story: Imagine you have your script, foo.sh, with the executable bit (x) set. This file contains e.g. the following:

#!/bin/sh

# some script commands follow...:
# *snip*

Now, on your shell, you type:

> ./foo.sh

Edit: Please also read the comments below after or before you read the following! As it turns out, I was mistaken. It's apparently not the shell that passes the script to the target interpreter, but the operating system (kernel) itself.

Remember that you type this inside the shell process (let's assume this is the program /bin/sh). Therefore, that input will have to be processed by that program. It interprets this line as a command, since it discovers that the very first thing entered on the line is the name of a file that actually exists and which has the executable bit(s) set.

/bin/sh then starts reading the file's contents and discovers the shebang (#!) right at the very beginning of the file. To the shell, this is a token ("magic number") by which it knows that the file contains a script.

Now, how does it know which programming language the script is written in? After all, you can execute Bash scripts, Perl scripts, Python scripts, ... All the shell knows so far is that it is looking at a script file (which is not a binary file, but a text file). Thus it reads the next input up to the first line break (which will result in /bin/sh, compare with the above). This is the interpreter to which the script will be passed for execution. (In this particular case, the target interpreter is the shell itself, so it doesn't have to invoke a new shell for the script; it simply processes the rest of the script file itself.)

If the script was destined for e.g. /bin/perl, all that the Perl interpreter would (optionally) have to do is look whether the shebang line really mentions the Perl interpreter. If not, the Perl interpreter would know that it cannot execute this script. If indeed the Perl interpreter is mentioned in the shebang line, it reads the rest of the script file and executes it.
