Pipeline for downloading and processing files in a Unix/Linux environment using Perl

Posted on 2024-08-28 22:58:11

I have a list of file URLs that I want to download:

http://somedomain.com/foo1.gz
http://somedomain.com/foo2.gz
http://somedomain.com/foo3.gz

What I want to do is the following for each file:

  1. Download foo1, foo2, ... in parallel with wget and nohup.
  2. Each time a download completes, process that file with myscript.sh.

What I have is this:

#! /usr/bin/perl

@files = glob("foo*.gz");

foreach $file (@files) {
   my $downurls = "http://somedomain.com/".$file;
   system("nohup wget $file &");
   system("./myscript.sh $file >> output.txt");
}

The problem is that the pipeline above has no way of knowing when a file has finished downloading,
so myscript.sh does not get executed properly.

What's the right way to achieve this?

Comments (3)

假情假意假温柔 2024-09-04 22:58:11

Why do this in Perl? Use bash instead. The following is just a sample.

#!/bin/bash

for file in foo1 foo2 foo3
do
    # Run myscript.sh only if wget exits successfully for this file
    if wget "http://somedomain.com/$file.gz"
    then
        ./myscript.sh "$file.gz" >> output.txt
    fi
done

笑忘罢 2024-09-04 22:58:11

Try combining the commands using &&, so that the 2nd one runs only after the 1st one completes successfully.

# wget needs the full URL ($downurls in the question's loop), not the bare file name
system("(nohup wget $downurls && ./myscript.sh $file >> output.txt) &");
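
The same idea can be written directly in the shell. The following is only a minimal sketch, not a drop-in replacement: fetch_and_process and its parameters are hypothetical names, and writing each result to its own "FILE.out" (instead of appending to a shared output.txt) is an assumption made so that parallel jobs cannot interleave their output.

```shell
#!/bin/bash
# Sketch: one background subshell per file. "&&" gates processing on a
# successful download, and "wait" blocks until every subshell has exited.

# fetch_and_process DOWNLOADER PROCESSOR BASE_URL FILE...   (hypothetical helper)
#   DOWNLOADER is called as: DOWNLOADER URL    (e.g. "wget -q")
#   PROCESSOR  is called as: PROCESSOR FILE    (e.g. ./myscript.sh)
fetch_and_process() {
    local downloader=$1 processor=$2 base_url=$3
    shift 3
    local file
    for file in "$@"; do
        (
            # The processor runs only if the downloader exits with status 0.
            # Assumption: each result goes to its own FILE.out.
            $downloader "$base_url/$file" && $processor "$file" > "$file.out"
        ) &
    done
    wait    # returns once all background subshells have finished
}
```

A call for the files in the question might look like: fetch_and_process "wget -q" ./myscript.sh http://somedomain.com foo1.gz foo2.gz foo3.gz. To keep the jobs alive after logging out, the whole script can be started under nohup rather than wrapping each wget.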

谢绝鈎搭 2024-09-04 22:58:11

If you want parallel processing, you can do the forking yourself, or use a module that handles it for you. Try Parallel::ForkManager. You can see a bit more on its usage in "How can I manage a fork pool in Perl?", but the CPAN page for the module has the really useful information. You probably want something like this:

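use strict;
use warnings;
use Parallel::ForkManager;

my $MAX_PROCESSES = 8; # 8 parallel processes max
my $pm = Parallel::ForkManager->new($MAX_PROCESSES);

# The files do not exist locally yet, so list their names instead of globbing
my @files = map { "foo$_.gz" } 1 .. 3;

foreach my $file (@files) {
  # Forks; the parent gets the child's pid and moves on to the next file
  $pm->start and next;

  my $downurls = "http://somedomain.com/" . $file;
  # Process the file only if wget succeeded (system returns 0 on success)
  system("wget", $downurls) == 0
    and system("./myscript.sh $file >> output.txt");

  $pm->finish; # Terminates the child process
}

$pm->wait_all_children; # Block until every child has finished
print "All done!\n";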
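
For comparison, a capped number of parallel jobs can also be approximated in plain shell. This is a rough sketch under stated assumptions, not an equivalent of Parallel::ForkManager: run_limited is a hypothetical helper, it runs jobs in fixed batches instead of refilling a free slot as soon as one child exits, and the per-file "FILE.out" output is an assumption.

```shell
#!/bin/bash
# Sketch: cap the number of concurrent "download && process" jobs.
# Jobs run in batches of MAX_JOBS; a batch must finish before the next starts.

# run_limited MAX_JOBS DOWNLOADER PROCESSOR BASE_URL FILE...   (hypothetical helper)
run_limited() {
    local max_jobs=$1 downloader=$2 processor=$3 base_url=$4
    shift 4
    local running=0 file
    for file in "$@"; do
        # Same gating as before: process only after a successful download
        ( $downloader "$base_url/$file" && $processor "$file" > "$file.out" ) &
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait          # block until the whole batch has finished
            running=0
        fi
    done
    wait                  # collect the final, possibly partial, batch
}
```

With the question's files this might be called as: run_limited 8 "wget -q" ./myscript.sh http://somedomain.com foo1.gz foo2.gz foo3.gz. Batching is simpler than a true pool but leaves slots idle while the slowest job of each batch finishes.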