cat/xargs commands VS for/bash commands

Published 2024-08-01 22:14:33 · 479 characters · 13 views · 0 comments


Page 38 of the book Linux 101 Hacks suggests:

cat url-list.txt | xargs wget -c

I usually do:

for i in `cat url-list.txt`
   do
      wget -c $i
   done 

Is there anything, other than brevity, that makes the xargs technique superior to the good old for-loop technique in bash?

Added

The C source code seems to have only one fork. In contrast, how many forks does the bash combo have? Please elaborate on the issue.
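The fork counts can be sketched directly, using echo as a stand-in for wget so nothing is downloaded (the three-line url-list.txt here is a made-up sample):

```shell
#!/bin/sh
# Hypothetical input: one URL per line.
printf 'u1\nu2\nu3\n' > url-list.txt

# The for loop forks one process per line -> three invocations here:
for i in $(cat url-list.txt); do echo wget -c "$i"; done

# xargs packs all the lines into a single invocation -> one fork:
xargs echo wget -c < url-list.txt
```

Each for-loop iteration prints (and in the real case, forks) once, while xargs emits a single combined command line.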


Comments (7)

香草可樂 2024-08-08 22:14:33


From the Rationale section of a UNIX manpage for xargs. (Interestingly this section doesn't appear in the OS X BSD version of xargs, nor in the GNU version.)

The classic application of the xargs
utility is in conjunction with the
find utility to reduce the number of
processes launched by a simplistic use
of the find -exec combination. The
xargs utility is also used to enforce
an upper limit on memory required to
launch a process. With this basis in
mind, this volume of POSIX.1-2008
selected only the minimal features
required.

In your follow-up, you ask how many forks the other version will have. Jim already answered this: one per iteration. How many iterations are there? It's impossible to give an exact number, but easy to answer the general question. How many lines are there in your url-list.txt file?

There are some other considerations. xargs requires extra care for filenames with spaces or other no-no characters, and -exec has an option (+) that groups processing into batches. So not everyone prefers xargs, and perhaps it's not best for all situations.
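Both points can be sketched. The NUL-delimited `-print0`/`-0` pair (a widely supported extension, not strict POSIX) handles spaces safely, and `-exec … {} +` batches like xargs does:

```shell
#!/bin/sh
# Made-up demo directory with a space in one filename.
mkdir -p demo && touch "demo/a b.log" demo/c.log demo/keep.txt

# NUL-delimited pipeline survives spaces in names:
find demo -name '*.log' -print0 | xargs -0 rm --

# Pure-POSIX alternative that batches into few invocations:
# find demo -name '*.log' -exec rm -- {} +
```

With a plain newline-delimited `find | xargs`, the name "a b.log" would be split into two bogus arguments.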

See these links:

权谋诡计 2024-08-08 22:14:33


Also consider:

xargs -I'{}' wget -c '{}' < url-list.txt

but wget provides an even better means for the same:

wget -c -i url-list.txt

With respect to the xargs-versus-loop consideration, I prefer xargs when the meaning and implementation are relatively "simple" and "clear"; otherwise, I use loops.
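Note that `-I` changes the execution model: xargs then runs one command per input line, substituting the line for `{}` (echo stands in for wget in this sketch):

```shell
# -I '{}' -> one invocation per line, with {} replaced by the line:
printf 'u1\nu2\n' | xargs -I'{}' echo wget -c '{}'
```

So the `-I` form forks per URL, like the for loop, trading xargs' batching away for per-item substitution.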

身边 2024-08-08 22:14:33


xargs will also allow you to have a huge list, which is not possible with the "for" version because the shell uses command lines limited in length.

悲歌长辞 2024-08-08 22:14:33


xargs is designed to process multiple inputs for each process it forks. A shell script with a for loop over its inputs must fork a new process for each input. Avoiding that per-process overhead can give an xargs solution a significant performance enhancement.
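The inputs-per-process trade-off is tunable: `-n` caps how many inputs each forked process receives, so larger batches mean fewer forks:

```shell
# -n 2 -> two inputs per invocation: "1 2" then "3 4" (two forks,
# where the for-loop equivalent would fork four times):
printf '1\n2\n3\n4\n' | xargs -n 2 echo
```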

故人如初 2024-08-08 22:14:33


Instead of GNU Parallel, I prefer using xargs' built-in parallel processing. Add -P to indicate how many forks to run in parallel. As in...

 seq 1 10 | xargs -n 1 -P 3 echo

would use 3 forks on 3 different cores for computation. This is supported by modern GNU xargs. You will have to verify for yourself whether your BSD or Solaris version supports it.
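A variant of the same idea scales `-P` to the machine (nproc is GNU coreutils; on BSD something like `sysctl -n hw.ncpu` would be the analogue). Output order may interleave across workers:

```shell
# One input per process (-n 1), one worker per CPU core:
seq 1 10 | xargs -n 1 -P "$(nproc)" echo
```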

九公里浅绿 2024-08-08 22:14:33


One advantage I can think of is that, if you have lots of files, it could be slightly faster since you don't have as much overhead from starting new processes.

I'm not really a bash expert though, so there could be other reasons it's better (or worse).

乜一 2024-08-08 22:14:33


Depending on your internet connection you may want to use GNU Parallel http://www.gnu.org/software/parallel/ to run it in parallel.

cat url-list.txt | parallel wget -c