Exit code of a system() call with a single scalar argument in Perl

Posted 2024-12-17 05:33:06

There is a system() call in a Perl script with multiple pipes, using a single scalar argument. The call looks more or less like this:

system("zcat /foo.gz | grep '^.{6}X|Y|Z' | awk '{print $2,$3,$4,$6}' | bzip2 > /foo.processed.bz2");

The file in question (foo.gz) is quite large, about 2GB compressed in size. I guess that's why it was originally done via a system call.

Questions:

The problem now is that this system call always seems to return 0, whether or not one of the commands in the pipeline fails. I assume this is because it gets invoked via sh -c '...'. Is that correct?

Is there a way to check whether a system() call was successful if only a single scalar argument is passed?

Is there a better way to process a large file like this, one that is equally or more efficient (mainly in terms of speed)?

Thanks for any hints as I am not really familiar with Perl.

Comments (4)

情话已封尘 2024-12-24 05:33:06

Two things:

  1. When you run a pipeline through system(), the status you get back is that of the last command in the pipeline. Thus, you're getting the status code of the bzip2 command, and a failure in zcat, grep or awk goes unnoticed.
  2. The program probably does this because the people who wrote it didn't know any better. I've seen Perl programs use system calls for finding the basename of a file, doing a find, and even doing a copy/rename/move. These are all things that can be done faster and more easily inside the Perl program, and you avoid the whole Windows/Unix compatibility issue.

You're usually better off using Perl modules for things like this. In this case, I bet the Perl modules will be even faster than the shell pipeline, and you'll have more control over the entire operation.

There's a set of modules called IO::Compress that can handle both Zip and BZip2.

I use Archive::Zip, which is a great module, but you want to use the Bzip2 compression algorithm, and Archive::Zip can't handle that.
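
As a rough sketch of the module-based approach, the following reads the gzip file with IO::Uncompress::Gunzip and writes the matching lines with IO::Compress::Bzip2, both from the IO::Compress distribution. The helper name filter_gz_to_bz2 and the command-line usage are made up for illustration, the field selection done by awk is left out to keep it short, and whether this is actually faster than the shell pipeline is something you would have to measure.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);
use IO::Compress::Bzip2    qw($Bzip2Error);

# Hypothetical helper: copy the lines of a .gz file that match $pattern
# into a .bz2 file, all inside one Perl process.
sub filter_gz_to_bz2 {
    my ($in_path, $out_path, $pattern) = @_;

    my $in = IO::Uncompress::Gunzip->new($in_path)
        or die "Cannot read $in_path: $GunzipError";
    my $out = IO::Compress::Bzip2->new($out_path)
        or die "Cannot write $out_path: $Bzip2Error";

    my $kept = 0;
    while (defined(my $line = $in->getline())) {
        next unless $line =~ $pattern;   # the grep step
        $out->print($line);
        $kept++;
    }

    $in->close();
    # Unlike the shell pipeline, a failure here is reported directly.
    $out->close() or die "Cannot finish $out_path: $Bzip2Error";
    return $kept;
}

# Assumed usage: perl filter.pl /foo.gz /foo.processed.bz2
if (@ARGV == 2) {
    my $kept = filter_gz_to_bz2(@ARGV, qr/^.{6}X|Y|Z/);
    print "kept $kept lines\n";
}
```

Every step either dies with a specific error or succeeds, so there is no single exit code to interpret afterwards.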

最佳男配角 2024-12-24 05:33:06

system() returns what the /bin/sh shell returns. When multiple commands are pipelined, the shell forks a new process for each of them and reports only the status code of the last command in the chain, in this case bzip2. A failure earlier in the pipeline is therefore not visible in the return value.
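
To make this concrete, here is a small sketch that decodes $? after system(), following the pattern shown in perldoc -f system. The helper name describe_status is invented for this example, and the second call assumes bash is installed, since pipefail is a bash option and not something a plain POSIX sh is guaranteed to support.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the wait status left in $? into readable text.
sub describe_status {
    my ($status) = @_;
    return "failed to execute" if $status == -1;
    return sprintf("died with signal %d", $status & 127) if $status & 127;
    return sprintf("exited with value %d", $status >> 8);
}

# Plain sh: only the last command (true) decides the status.
system("false | true");
print "sh:   ", describe_status($?), "\n";   # exited with value 0

# Assumption: bash is available. With pipefail the pipeline fails
# if any command in it fails.
system("bash", "-c", "set -o pipefail; false | true");
print "bash: ", describe_status($?), "\n";   # exited with value 1 when bash is present
```

So a single-string system() call can still be checked, as long as the shell that runs it is told to propagate failures from the whole pipeline.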

初熏 2024-12-24 05:33:06

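Based on your comments and answers, I'd do it like this now:

# Turn "file.gz" into a pipe-open spec: "gzip -dc < file.gz|"
$infile =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
open(OUTFH, "| /bin/bzip2 > $outfile") or die "Can't open $outfile: $!";
open(INFH, $infile) or die "Can't open $infile: $!";
while (my $line = <INFH>) {
    if ($line =~ /^.{6}X|Y|Z/) {
        # TODO: the awk part...
        print OUTFH $line;
    }
}
# close() on a pipe waits for the child and leaves its exit status in $?
close(INFH) or die "gzip failed: status $?";
close(OUTFH) or die "bzip2 failed: status $?";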

Please feel free to comment and vote up/down.
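
For the awk step that is still marked TODO in the loop, one possible sketch is a small helper that picks fields 2, 3, 4 and 6 from a line. The name awk_fields is made up here; it relies on split ' ', which splits on runs of whitespace and ignores leading whitespace, much like awk's default field splitting.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate awk '{print $2,$3,$4,$6}' for one line.
sub awk_fields {
    my ($line) = @_;
    my @f = split ' ', $line;   # awk-style whitespace split
    # awk fields are 1-based, Perl arrays 0-based; missing fields become "".
    return join(' ', map { $_ // '' } @f[1, 2, 3, 5]) . "\n";
}

print awk_fields("a b c d e f\n");   # prints "b c d f"
```

Inside the loop, print OUTFH awk_fields($line) would then take the place of print OUTFH $line.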

妞丶爷亲个 2024-12-24 05:33:06

You'd be better off doing the text processing from within perl itself - that's what perl's for :)

system() returns the command's wait status (the exit code is $? >> 8), not its output. To capture the actual output, try calling it via backticks: `command` rather than system('command')
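
A minimal sketch of the difference between the two calls, using throwaway commands (echo hello and exit 3) chosen only for illustration: backticks hand back what the command printed, while system() hands back its status, and both leave the wait status in $?.

```perl
use strict;
use warnings;

# Backticks capture standard output; $? still holds the wait status.
my $output      = `echo hello`;
my $echo_status = $? >> 8;
print "output: $output";                 # output: hello
print "exit status: $echo_status\n";     # exit status: 0

# system() returns the wait status itself; the exit code is in the high byte.
my $rc = system("sh", "-c", "exit 3");
my $exit_code = $rc >> 8;
print "system exit code: $exit_code\n";  # system exit code: 3
```

Note that backticks would not help with the original pipeline, because its output is redirected to a file and nothing is left to capture.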
