How does bash parse multi-flag commands?

Posted on 2025-01-29 06:44:57

I'm trying to create an overly simplified version of bash, and I've tried to split the program into "lexer + expander, parser, executor".
In the lexer I store my data (commands, flags, files) and create tokens out of them. My procedure is simply to loop through the input character by character and use a state machine whose states are: special character, alphanumeric character, or space.
When I'm in the alphanumeric state I'm inside a command, and I detect the next flag when I enter the alphanumeric state again or when input[i] == '-'. The problem is with multi-flag commands.
For example:

$ ls -la | grep "*.c"

I successfully get the commands ls and grep and the flags -la and *.c.
However, I run into trouble with multi-flag commands like:

$ sed -i "*.bak" "s/a/b/g" file1 file2

This seems very difficult to me, and I can't yet figure out how to know where the flags of a specific command end. So my question is: how does bash parse these multi-flag commands? Any suggestions regarding my problem would be appreciated!

Comments (1)

╰つ倒转 2025-02-05 06:44:57

The shell does not attempt to parse command arguments; that's the responsibility of the utility. The range of possible command argument syntaxes, both in use and potentially useful, is far too great to attempt that.
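To illustrate why the shell cannot decide this on the utility's behalf, here is a small sketch using Python's standard getopt module. The two "utilities" and their option strings are hypothetical, chosen only to show that the same argument vector can be interpreted in two different ways depending on what the receiving program declares.

```python
import getopt

# The argument vector a shell would hand over for a hypothetical command line:
#   sometool -i "*.bak" "s/a/b/g" file1 file2
argv = ["-i", "*.bak", "s/a/b/g", "file1", "file2"]


def parse(optstring, args):
    """Parse args the way a utility with the given getopt optstring would."""
    options, operands = getopt.getopt(args, optstring)
    return options, operands


# Hypothetical utility A declares that -i takes an argument ("i:").
print(parse("i:", argv))
# Hypothetical utility B declares -i as a plain switch ("i").
print(parse("i", argv))
```

For utility A, "*.bak" is consumed as the value of -i; for utility B, it is just the first operand. Only the utility knows which reading is intended, so a lexer that tries to label flags has no reliable rule to follow.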

On Unix-like systems, the shell identifies individual arguments from the command line, mostly by splitting at whitespace but also taking into account the use of quotes and a variety of other transformations, such as "glob expansion". It then makes a vector of these arguments ("argv") and passes the vector to execve, which hands them to the newly created process.
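A minimal sketch of that splitting step, written as a character-by-character state machine similar to the one described in the question. The function name tokenize and the token kinds "WORD" and "PIPE" are assumptions made for this example. It deliberately ignores backslash escapes, variable expansion, and glob expansion, all of which a real shell handles.

```python
def tokenize(line):
    """Split a command line into (kind, text) tokens.

    kind is "WORD" or "PIPE". Quotes group characters into a single word
    and are removed. No attempt is made to decide which words are flags.
    """
    tokens = []
    word = []        # characters of the word being built
    in_word = False  # True once a word has started (even an empty "" one)
    quote = None     # the active quote character, or None

    for ch in line:
        if quote:
            if ch == quote:
                quote = None          # closing quote; the word may continue
            else:
                word.append(ch)
        elif ch in "'\"":
            quote = ch                # opening quote starts or continues a word
            in_word = True
        elif ch.isspace() or ch == "|":
            if in_word:               # a delimiter ends the current word
                tokens.append(("WORD", "".join(word)))
                word, in_word = [], False
            if ch == "|":
                tokens.append(("PIPE", ch))
        else:
            word.append(ch)
            in_word = True

    if quote:
        raise ValueError("unterminated quote")
    if in_word:
        tokens.append(("WORD", "".join(word)))
    return tokens


print(tokenize('sed -i "*.bak" "s/a/b/g" file1 file2'))
```

Every token that is not an operator is simply a word. Whether "-i" is a flag and whether "*.bak" belongs to it is never asked at this stage.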

On Windows systems, the shell doesn't even do that. It just hands over the command line as a single string, and leaves it to the command-line tool to do everything. (In order to provide a modicum of compatibility, there's an intermediate layer which is called by the application initialization code, which eventually calls main(). This does some basic argument-splitting, although its quoting algorithm is considerably simpler than that used by a Unix shell.)

No command-line shell that I know of attempts to identify command-line flags. And neither should you.
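In practice this means the parser stage only has to group words into commands, not classify them. The sketch below assumes a lexer has already produced (kind, text) pairs with the hypothetical kinds "WORD" and "PIPE"; the function name split_pipeline is likewise made up for this example.

```python
def split_pipeline(tokens):
    """Group (kind, text) tokens into one argv list per pipeline stage."""
    commands = [[]]
    for kind, text in tokens:
        if kind == "PIPE":
            if not commands[-1]:
                raise SyntaxError("empty command before '|'")
            commands.append([])       # start the next command's argv
        else:
            commands[-1].append(text)  # every word is just another argument
    if not commands[-1]:
        raise SyntaxError("empty command at end of pipeline")
    return commands


# Tokens as a lexer might emit them for:  ls -la | grep "*.c"
tokens = [
    ("WORD", "ls"), ("WORD", "-la"), ("PIPE", "|"),
    ("WORD", "grep"), ("WORD", "*.c"),
]
print(split_pipeline(tokens))
```

Each inner list is a complete argument vector: its first element names the program and the rest is passed through untouched, so a command with many flags needs no special handling.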

For a bit of extracurricular reading, here's the description of shell parsing from the POSIX standard: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html. Trying to implement all of that goes far beyond the requirements given to you for this assignment, and I'm certainly not recommending that you do so. But it might still be interesting, and understanding it will help you immensely if you start using a shell.

Alternatively, you could try reading the Bash manual, which might be easier to understand. Note that Bash implements a lot of extensions to the POSIX standard.
