Controlling shell command-line wildcard expansion in C or C++
I'm writing a program, foo, in C++. It's typically invoked on the command line like this:
foo *.txt
My main() receives the arguments in the normal way. On many systems, argv[1] is literally *.txt, and I have to call system routines to do the wildcard expansion. On Unix systems, however, the shell expands the wildcard before invoking my program, and all of the matching filenames will be in argv.
Suppose I wanted to add a switch to foo that causes it to recurse into subdirectories.
foo -a *.txt
would process all text files in the current directory and all of its subdirectories.
I don't see how this is done, since, by the time my program gets a chance to see the -a, the shell has already done the expansion and the user's *.txt input is lost. Yet there are common Unix programs that work this way. How do they do it?
In Unix land, how can I control the wildcard expansion?
(Recursing through subdirectories is just one example. Ideally, I'm trying to understand the general solution to controlling the wildcard expansion.)
Comments (4)
Your program has no influence over the shell's command-line expansion. Which program will be called is determined after all the expansion is done, so by then it is too late to change anything about the expansion programmatically.
The user calling your program, on the other hand, can construct whatever command line they like. Shells let you easily prevent wildcard expansion, usually by putting the argument in single quotes:
foo -a '*.txt'
If your program is called like that, it will receive two parameters, -a and *.txt. On Unix, you should just leave it to the user to manually prevent wildcard expansion when it is not wanted.
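If the user quotes the pattern so that it reaches foo unexpanded, foo has to do the expansion itself. The following is a minimal sketch of that step, assuming a POSIX system that provides glob(); the helper name expand_pattern is hypothetical and chosen only for illustration.

```cpp
#include <glob.h>   // POSIX glob(), globfree()

#include <cstddef>
#include <string>
#include <vector>

// Expand one shell-style pattern (for example "*.txt") into the matching
// path names. If nothing matches, or glob() reports an error, the pattern
// itself is returned unchanged, similar to what a POSIX shell passes along
// for a pattern that matches nothing.
std::vector<std::string> expand_pattern(const std::string& pattern) {
    std::vector<std::string> names;
    glob_t results{};  // zero-initialised so globfree() is always safe

    if (glob(pattern.c_str(), 0, nullptr, &results) == 0) {
        for (std::size_t i = 0; i < results.gl_pathc; ++i) {
            names.emplace_back(results.gl_pathv[i]);
        }
    } else {
        names.push_back(pattern);
    }

    globfree(&results);
    return names;
}
```

Each non-option entry of argv could be passed through expand_pattern. An argument without wildcard characters comes back unchanged, so the same code path works whether or not the shell has already done the expansion.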
As the other answers said, the shell does the wildcard expansion, and you stop it from doing so by enclosing arguments in quotes.
Note that the options -R and -r are usually used to indicate recursion; see cp, ls, etc. for examples.
Assuming you organize things appropriately so that wildcards are passed to your program as wildcards and you want to do recursion, then POSIX provides routines to help:
nftw - file tree walk (recursive access).
fnmatch, glob, wordexp - to do filename matching and expansion.
There is also ftw, which is very similar to nftw, but it is marked 'obsolescent', so new code should not use it.
Adrian asked:
To adapt the question to a convenient location on my computer, let's review. I have a sub-directory 'mte' that contains three files, and I have six files with names that start with 'm'.
When I type 'ls -R1 m*', the shell notes the metacharacter '*' and uses its equivalent of glob() or wordexp() to expand that into the list of names.
Then the shell arranges to run '/bin/ls' with 9 arguments (program name, option -R1, and 7 file names), followed by the terminating null pointer.
The ls command notes the options (recursive and single-column output) and gets to work. For the directory argument, ls prints its name and its contents, invoking its equivalent of nftw() to do the job. ls processes the non-directory names first, then processes the directory names in alphabetical order (by default), and does a depth-first scan of each directory.
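The recursion described in this answer can be sketched with the POSIX routines it names. This is a minimal sketch and not how ls is actually implemented: it assumes a POSIX system, and find_recursive, visit, g_pattern, and g_matches are hypothetical names chosen for illustration. It walks a directory tree with nftw() and keeps the regular files whose names match a pattern according to fnmatch().

```cpp
#include <fnmatch.h>   // fnmatch()
#include <ftw.h>       // nftw(), struct FTW, FTW_F, FTW_PHYS
#include <sys/stat.h>  // struct stat

#include <string>
#include <vector>

// nftw() callbacks receive no user-data argument, so this sketch keeps its
// state in file-local variables. That makes it unsuitable for threads.
static std::string g_pattern;
static std::vector<std::string> g_matches;

// Called by nftw() once for every entry it visits.
static int visit(const char* path, const struct stat*, int type,
                 struct FTW* info) {
    // info->base is the offset of the file name inside the full path.
    const char* name = path + info->base;
    if (type == FTW_F && fnmatch(g_pattern.c_str(), name, 0) == 0) {
        g_matches.emplace_back(path);
    }
    return 0;  // zero tells nftw() to continue the walk
}

// Collect every regular file below root whose name matches pattern.
// Returns an empty list if root cannot be walked at all.
std::vector<std::string> find_recursive(const std::string& root,
                                        const std::string& pattern) {
    g_pattern = pattern;
    g_matches.clear();
    // 16 is the maximum number of directories nftw() may hold open at once;
    // FTW_PHYS asks it not to follow symbolic links.
    nftw(root.c_str(), visit, 16, FTW_PHYS);
    return g_matches;
}
```

With this, a call such as find_recursive(".", "*.txt") would play the role of foo -a '*.txt', provided the user quotes the pattern so that it reaches the program unexpanded.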
Part of the shell's job (on Unix) is to expand command line wildcard arguments. You prevent this with quotes.
Also, on Unix systems, the find command does what you want: it can list all files recursively from the current directory down, so you could feed its output to foo.
I wanted to point out another way to turn off wildcard expansion. You can tell your shell to stop expanding wildcards with the noglob option. With bash, use set -o noglob; with csh, use set noglob.