Controlling shell command-line wildcard expansion in C or C++

Posted on 2024-08-28 14:57:42

I'm writing a program, foo, in C++. It's typically invoked on the command line like this:

foo *.txt

My main() receives the arguments in the normal way. On many systems, argv[1] is literally *.txt, and I have to call system routines to do the wildcard expansion. On Unix systems, however, the shell expands the wildcard before invoking my program, and all of the matching filenames will be in argv.

Suppose I wanted to add a switch to foo that causes it to recurse into subdirectories.

foo -a *.txt

would process all text files in the current directory and all of its subdirectories.

I don't see how this is done since, by the time my program gets a chance to see the -a, the shell has already done the expansion and the user's *.txt input is lost. Yet there are common Unix programs that work this way. How do they do it?

In Unix land, how can I control the wildcard expansion?

(Recursing through subdirectories is just one example. Ideally, I'm trying to understand the general solution to controlling the wildcard expansion.)

Comments (4)

悲凉≈ 2024-09-04 14:57:42

Your program has no influence over the shell's command-line expansion. Which program will be called is determined only after all expansion is done, so by then it is too late to change anything about the expansion programmatically.

The user calling your program, on the other hand, can construct whatever command line they like. Shells let you prevent wildcard expansion easily, usually by putting the argument in single quotes:

program -a '*.txt'

If your program is called like that, it will receive two parameters: -a and *.txt.

On Unix, you should just leave it to the user to manually prevent wildcard expansion if it is not desired.
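
Once the quoted pattern arrives in argv untouched, the program has to expand it itself. Here is a minimal sketch of that step using POSIX glob(). The helper name expand_pattern and its error handling are assumptions made for illustration, not part of the original answer.

```cpp
#include <glob.h>      // POSIX glob() and globfree()

#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: expand a pattern such as "*.txt" that reached the
// program unexpanded, for example because the user quoted it.
// Returns the matching path names in the sorted order glob() produces.
// An empty result means that nothing matched.
std::vector<std::string> expand_pattern(const std::string& pattern) {
    glob_t results{};
    const int rc = glob(pattern.c_str(), 0, nullptr, &results);

    std::vector<std::string> names;
    if (rc == 0) {
        // Without GLOB_NOCHECK, success implies at least one match.
        assert(results.gl_pathc > 0);
        names.assign(results.gl_pathv, results.gl_pathv + results.gl_pathc);
    }
    globfree(&results);  // release whatever glob() allocated

    if (rc != 0 && rc != GLOB_NOMATCH) {
        throw std::runtime_error("glob() failed for pattern: " + pattern);
    }
    return names;
}
```

A main() could pass argv[2] to this helper whenever -a is present. Like the shell's own expansion, it matches in the current directory only, so recursion still has to be handled separately.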

本王不退位尔等都是臣 2024-09-04 14:57:42

As the other answers said, the shell does the wildcard expansion - and you stop it from doing so by enclosing arguments in quotes.

Note that the options -R and -r are usually used to request recursion; see cp, ls, etc. for examples.

Assuming you organize things so that wildcards reach your program as wildcards, and you want to recurse, POSIX provides routines to help:

  • nftw: file tree walk (recursive access).
  • fnmatch, glob, wordexp: filename matching and expansion.

There is also ftw, which is very similar to nftw, but it is marked 'obsolescent', so new code should not use it.
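
As a rough illustration of how these routines combine, the sketch below walks a directory tree with nftw() and keeps the regular files whose base name matches a pattern according to fnmatch(). The function name find_recursive, the file-scope state, and the descriptor limit of 16 are assumptions chosen for the example.

```cpp
#include <fnmatch.h>   // POSIX fnmatch()
#include <ftw.h>       // POSIX nftw()

#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

namespace {
// nftw() has no user-data parameter, so the callback reads its inputs from
// file-scope state. That keeps the sketch short but is not thread-safe.
const char* g_pattern = nullptr;
std::vector<std::string>* g_matches = nullptr;

int visit(const char* path, const struct stat*, int type, struct FTW* info) {
    assert(g_pattern != nullptr && g_matches != nullptr);
    // info->base is the offset of the file name within path.
    if (type == FTW_F && fnmatch(g_pattern, path + info->base, 0) == 0) {
        g_matches->push_back(path);
    }
    return 0;  // zero tells nftw() to keep walking
}
}  // namespace

// Hypothetical helper: collect every regular file under root whose name
// matches pattern. A missing or unreadable root yields an empty result.
std::vector<std::string> find_recursive(const std::string& root,
                                        const std::string& pattern) {
    std::vector<std::string> matches;
    g_pattern = pattern.c_str();
    g_matches = &matches;
    // FTW_PHYS: do not follow symbolic links. 16 is an arbitrary cap on the
    // number of directories nftw() keeps open at once.
    nftw(root.c_str(), visit, 16, FTW_PHYS);
    g_pattern = nullptr;
    g_matches = nullptr;
    std::sort(matches.begin(), matches.end());
    return matches;
}
```

With something like this in place, foo -a '*.txt' could call find_recursive(".", "*.txt"). The pattern still has to be quoted on the command line so that the shell passes it through unexpanded.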


Adrian asked:

But I can say ls -R *.txt without single quotes and get a recursive listing. How does that work?

To adapt the question to a convenient location on my computer, let's review:

$ ls -F | grep '^m'
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2
mte/
$ ls -R1 m*
makefile
mapmain.pl
minimac.group
minimac.passwd
minimac_13.terminal
mkmax.sql.bz2

mte:
multithread.ec
multithread.ec.original
multithread2.ec
$

So, I have a sub-directory 'mte' that contains three files, and I have six files with names that start with 'm'.

  • When I type 'ls -R1 m*', the shell notes the metacharacter '*' and uses its equivalent of glob() or wordexp() to expand that into the list of names:

    1. makefile
    2. mapmain.pl
    3. minimac.group
    4. minimac.passwd
    5. minimac_13.terminal
    6. mkmax.sql.bz2
    7. mte
  • Then the shell arranges to run '/bin/ls' with 9 arguments (the program name, the option -R1, and 7 file names), followed by a terminating null pointer.

  • The ls command notes the options (recursive and single-column output), and gets to work.
    • The first 6 names (as it happens) are simple files, so there is nothing recursive to do.
    • The last name is a directory, so ls prints its name and its contents, invoking its equivalent of nftw() to do the job.
    • At this point, it is done.
  • This uncontrived example doesn't show what happens when there are multiple directories, and so the description above over-simplifies the processing.
  • Specifically, ls processes the non-directory names first, and then processes the directory names in alphabetic order (by default), and does a depth-first scan of each directory.
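
The processing described above can be approximated in a few lines of C++17. This is only a sketch under stated assumptions: collect_files is a hypothetical name, directory arguments are skipped when recursion is off (ls itself lists their contents one level deep), and the real ls does not use std::filesystem.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: take names as they arrive in argv (already expanded
// by the shell) and return the files to process. Non-directory names come
// first; with recursive set, the regular files found under each directory
// argument follow, directory by directory, in sorted order.
std::vector<std::string> collect_files(std::vector<std::string> args,
                                       bool recursive) {
    std::sort(args.begin(), args.end());

    std::vector<std::string> files;
    std::vector<std::string> dirs;
    for (const auto& name : args) {
        std::error_code ec;
        (fs::is_directory(name, ec) ? dirs : files).push_back(name);
    }
    assert(files.size() + dirs.size() == args.size());
    if (!recursive) {
        return files;  // assumption: directories are ignored without the flag
    }

    for (const auto& dir : dirs) {
        std::vector<std::string> found;
        std::error_code ec;
        fs::recursive_directory_iterator it(
            dir, fs::directory_options::skip_permission_denied, ec);
        const fs::recursive_directory_iterator end;
        while (!ec && it != end) {
            std::error_code entry_ec;  // per-entry errors do not stop the walk
            if (it->is_regular_file(entry_ec)) {
                found.push_back(it->path().string());
            }
            it.increment(ec);
        }
        std::sort(found.begin(), found.end());
        files.insert(files.end(), found.begin(), found.end());
    }
    return files;
}
```

Note that the pattern itself never reaches the program here. As with ls -R m*, recursion applies only to the directories the shell happened to include in the expanded list, and everything inside them is taken regardless of name.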

养猫人 2024-09-04 14:57:42

foo -a '*.txt'

Part of the shell's job (on Unix) is to expand command line wildcard arguments. You prevent this with quotes.

Also, on Unix systems, the "find" command does what you want:

find . -name '*.txt'

will list all files whose names match *.txt, recursively from the current directory down.

Thus, you could do

foo `find . -name '*.txt'`

爱人如己 2024-09-04 14:57:42

I wanted to point out another way to turn off wildcard expansion. You can tell your shell to stop expanding wildcards with the noglob option.

With bash use set -o noglob:

> touch a b c
> echo *
a b c
> set -o noglob
> echo *
*

And with csh, use set noglob:

> echo *
a b c
> set noglob
> echo *
*