Exit code of a system() call with a single scalar argument in Perl
A Perl script contains a system() call that takes a single scalar argument and runs several piped commands. The call looks more or less like this:
system("zcat /foo.gz | grep '^.{6}X|Y|Z' | awk '{print $2,$3,$4,$6}' | bzip2 > /foo.processed.bz2");
The file in question (foo.gz) is quite large, about 2 GB compressed. I guess that's why it was originally done via a system call.
Questions:
1. The system call always seems to return 0, whether or not one of the commands fails. I assume this is because it gets invoked via sh -c '...'. Is that correct?
2. Is there a way to check whether a system() call was successful if only a single scalar argument is passed?
3. Is there a better way to process a large file like this, one that is equally or more efficient (mainly in terms of speed)?
Thanks for any hints, as I am not really familiar with Perl.
Comments (4)
Two things:
1. You only get the status code of the bzip2 command.
2. You're always better off using Perl modules for things like this. In this case, I bet the Perl modules will be even faster than the shell pipeline, and you'll have more control over the entire operation.
There's a set called IO::Compress that can handle both Zip and BZip2.
I use Archive::Zip, which is a great module, but you want to use the Bzip2 compression algorithm, and Archive::Zip can't handle that.
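As a rough illustration of the module-based approach, here is a minimal sketch that streams the gzip file through IO::Uncompress::Gunzip and writes the result with IO::Compress::Bzip2. The helper name filter_gz_to_bz2, the command-line paths, and the regex are assumptions made for the example; the regex in particular is only one guess at what the grep pattern in the question was meant to express.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);
use IO::Compress::Bzip2    qw($Bzip2Error);

# Hypothetical helper (not from the thread): stream a gzip file line by line,
# keep the lines matching $filter, and write whitespace-separated fields
# 2, 3, 4 and 6 of each kept line to a bzip2 file.
# Returns the number of lines written.
sub filter_gz_to_bz2 {
    my ($in_path, $out_path, $filter) = @_;

    my $in = IO::Uncompress::Gunzip->new($in_path)
        or die "cannot read $in_path: $GunzipError\n";
    my $out = IO::Compress::Bzip2->new($out_path)
        or die "cannot write $out_path: $Bzip2Error\n";

    my $written = 0;
    while (defined(my $line = $in->getline())) {
        next unless $line =~ $filter;
        my @fields = split ' ', $line;    # awk-style splitting on whitespace
        # Missing fields become empty strings, as they would in awk.
        my @picked = map { defined $_ ? $_ : '' } @fields[1, 2, 3, 5];
        $out->print(join(' ', @picked), "\n")
            or die "write to $out_path failed: $Bzip2Error\n";
        $written++;
    }

    $in->close();
    $out->close() or die "cannot finish $out_path: $Bzip2Error\n";
    return $written;
}

# Only runs when an input and an output path are given on the command line.
# The regex is an assumed reading of the grep pattern in the question.
if (@ARGV == 2) {
    my $count = filter_gz_to_bz2($ARGV[0], $ARGV[1], qr/^.{6}[XYZ]/);
    print "wrote $count lines\n";
}
```

Every failure here raises an error inside the script, which is the extra control the answer refers to. Whether it beats the pipeline on speed is worth measuring, since the shell version runs each stage in its own process in parallel.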
system() returns what the /bin/sh shell returns. When multiple commands are pipelined, the shell forks a new process for each of them and returns the status code of the last command in the chain, in this case bzip2.
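One way to make an earlier failure in the pipeline visible is bash's pipefail option, which is not mentioned in the thread and is brought in here as an alternative. The sketch below assumes bash is installed; run_plain and run_with_pipefail are hypothetical helper names, and `false | true` stands in for a pipeline whose first stage fails.

```perl
use strict;
use warnings;

# Hypothetical helper: run a pipeline the way single-argument system() does,
# through the default shell, and return the exit code.
sub run_plain {
    my ($pipeline) = @_;
    system($pipeline);
    die "could not start the shell: $!\n" if $? == -1;
    die sprintf("shell killed by signal %d\n", $? & 127) if $? & 127;
    return $? >> 8;
}

# Hypothetical helper: run the same pipeline through bash with pipefail set.
# With pipefail, the pipeline's status is that of the rightmost command that
# exited non-zero, or 0 if every command succeeded.
sub run_with_pipefail {
    my ($pipeline) = @_;
    # List form of system(): Perl starts bash directly (assumes bash exists).
    system('bash', '-c', "set -o pipefail; $pipeline");
    die "could not start bash: $!\n" if $? == -1;
    die sprintf("bash killed by signal %d\n", $? & 127) if $? & 127;
    return $? >> 8;
}

printf "plain sh: %d\n", run_plain('false | true');          # plain sh: 0
printf "pipefail: %d\n", run_with_pipefail('false | true');  # pipefail: 1
```

The first call hides the failure of `false` because only the status of `true` is reported; the second surfaces it.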
Based on your comments and answers, I'd do it like that now:
Please feel free to comment and vote up/down.
You'd be better off doing the text processing within Perl itself; that's what Perl is for :)
system() returns the command's wait status (the same value as $?), not its output; the exit code is that value shifted right by 8 bits. To capture the actual output, call the command via backticks: `command` rather than system('command').
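To make the difference between the return value and the captured output concrete, here is a small sketch. describe_status is a hypothetical helper name, and the two commands are placeholders chosen so that the result is predictable.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the return value of system(), or
# $? after backticks) into a short description.
sub describe_status {
    my ($status) = @_;
    return 'failed to start' if $status == -1;
    return sprintf('killed by signal %d', $status & 127) if $status & 127;
    return sprintf('exit code %d', $status >> 8);
}

# system() hands back the wait status, not the command's output.
my $status = system('sh', '-c', 'exit 3');
print describe_status($status), "\n";    # exit code 3

# Backticks capture standard output; the status is still available in $?.
my $output = `echo hello`;
chomp $output;
print "captured '$output', ", describe_status($?), "\n";
# captured 'hello', exit code 0
```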