How do I open an array of files in Perl?
In perl, I read in files from a directory, and I want to open them all simultaneously (but line by line) so that I can perform a function that uses all of their nth lines together (e.g. concatenation).
my $text = `ls | grep ".txt"`;
my @temps = split(/\n/,$text);
my @files;
for my $i (0..$#temps) {
my $file;
open($file,"<",$temps[$i]);
push(@files,$file);
}
my $concat;
for my $i (0..$#files) {
my @blah = <$files[$i]>;
$concat.=$blah;
}
print $concat;
I just get a bunch of errors: use of uninitialized value, and GLOB(..) errors. So how can I make this work?
Comments (4)
A lot of issues, starting with the call to "ls | grep" :)
Let's start with some code:
First, let's get the list of files:
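The code for this step did not survive on this page. A minimal sketch, assuming the built-in glob is what was meant (txt_names is a hypothetical helper name):

```perl
use strict;
use warnings;

# Hypothetical helper: every name in the current directory ending in .txt.
# glob() expands the pattern inside Perl, so no ls, grep, or shell is needed.
sub txt_names {
    my @names = glob('*.txt');
    return @names;
}

print "$_\n" for txt_names();
```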
But it would be better to test whether each name refers to a file rather than a directory:
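The original snippet is missing here as well. One way to sketch the check, assuming the -f file test is acceptable (txt_files is a hypothetical name):

```perl
use strict;
use warnings;

# Hypothetical helper: like a plain glob, but keeps only regular files,
# so a directory that happens to be called "old.txt" is skipped.
sub txt_files {
    my @files = grep { -f $_ } glob('*.txt');
    return @files;
}

print "$_\n" for txt_files();
```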
Now, let's open these files to read them:
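A sketch of the opening step, since the answer's own code was lost. It deliberately leaves out error handling, because the answer turns to that next; open_all is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: open each name for reading, return the handles in order.
# Nothing checks open()'s return value here, so a failure passes unnoticed.
sub open_all {
    my @names = @_;
    my @handles = map { open(my $fh, '<', $_); $fh } @names;
    return @handles;
}

my @handles = open_all(grep { -f $_ } glob('*.txt'));
print scalar(@handles), " file(s) opened\n";
```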
But we need a way to handle errors. In my opinion the best way is to add use autodie; at the beginning of the script (and to install autodie, if you don't have it yet). Alternatively, you can:
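Both approaches can be sketched side by side. The first uses autodie as the answer recommends. The second is only an assumption about the alternative, whose code is missing: an explicit check on each open. Both sub names are hypothetical.

```perl
use strict;
use warnings;

# Option 1: autodie makes open() throw an exception on failure.
# It is scoped to this sub only so that both options fit in one file;
# the answer puts it at the top of the script instead.
sub open_with_autodie {
    my ($name) = @_;
    use autodie qw(open);
    open(my $fh, '<', $name);
    return $fh;
}

# Option 2 (assumed alternative): check the result of open() by hand.
sub open_checked {
    my ($name) = @_;
    open(my $fh, '<', $name) or die "Cannot open '$name': $!\n";
    return $fh;
}
```

Either way, a missing or unreadable file stops the program with a message instead of leaving an unusable handle in the list.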
Now that we have the handles, let's get the first line (as you showed in your example) from each of the inputs, and concatenate them:
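A minimal sketch of that loop, with in-memory files standing in for the real handles so it runs anywhere (first_lines and the sample data are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: read one line from each handle and append it.
sub first_lines {
    my @handles = @_;
    my $concatenated = '';
    for my $fh (@handles) {
        my $line = <$fh>;                          # scalar context: one line only
        $concatenated .= $line if defined $line;   # an empty file adds nothing
    }
    return $concatenated;
}

# In-memory files: open() accepts a reference to a string as the "file".
my ($text_a, $text_b) = ("a1\na2\n", "b1\nb2\n");
open(my $fh_a, '<', \$text_a) or die $!;
open(my $fh_b, '<', \$text_b) or die $!;
print first_lines($fh_a, $fh_b);   # prints "a1" and "b1", one per line
```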
This is perfectly fine and readable, but it can still be shortened, while maintaining (in my opinion) readability, to:
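The shortened form is not preserved either. A plausible one-liner, assuming join and map were meant, and using the // operator (Perl 5.10 or later) so an empty file contributes an empty string:

```perl
use strict;
use warnings;

# Hypothetical helper: same result as an explicit loop, in one expression.
# <$_> reads from the handle held in $_; scalar() forces a single line.
sub first_lines_short {
    my @handles = @_;
    return join '', map { scalar(<$_>) // '' } @handles;
}

my ($text_a, $text_b) = ("a1\na2\n", "b1\nb2\n");
open(my $fh_a, '<', \$text_a) or die $!;
open(my $fh_b, '<', \$text_b) or die $!;
print first_lines_short($fh_a, $fh_b);
```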
The effect is the same: $concatenated contains the first lines from all files.
So the whole program would look like this:
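Putting the steps together gives something like the following. It is a reconstruction under the same assumptions as before (glob, -f, autodie, join/map), not the answer's exact program, and the wrapper sub is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    # open() and close() die on failure

# Hypothetical wrapper: the first line of every .txt file in the
# current directory, concatenated in glob order.
sub first_lines_of_txt_files {
    my @handles = map { open(my $fh, '<', $_); $fh } grep { -f $_ } glob('*.txt');
    my $concatenated = join '', map { scalar(<$_>) // '' } @handles;
    close($_) for @handles;
    return $concatenated;
}

print first_lines_of_txt_files();
```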
Now, it might be that you want to concatenate not just the first lines, but all of them. In that situation, instead of the $concatenated = ... code, you'd need something like this:
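The code that followed is missing. A sketch of one way to do it, assuming "all of them" means line 1 of every file, then line 2 of every file, and so on, with shorter files simply dropping out (interleave_lines is a hypothetical name):

```perl
use strict;
use warnings;

# Hypothetical helper: round-robin over the handles until all are exhausted.
sub interleave_lines {
    my @handles = @_;
    my $concatenated = '';
    while (@handles) {
        my $fh   = shift @handles;
        my $line = <$fh>;
        next unless defined $line;   # end of this file: drop its handle
        $concatenated .= $line;
        push @handles, $fh;          # more may follow: back of the queue
    }
    return $concatenated;
}

my ($text_a, $text_b) = ("a1\na2\n", "b1\n");
open(my $fh_a, '<', \$text_a) or die $!;
open(my $fh_b, '<', \$text_b) or die $!;
print interleave_lines($fh_a, $fh_b);   # prints a1, b1, a2, one per line
```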
Here is your problem:
First, <$files[$i]> isn't a valid filehandle read. This is the source of your GLOB(...) errors. See mobrule's answer for why this is the case. So change it to this:
Second problem: you're mixing @blah (an array named blah) and $blah (a scalar named blah). This is the source of your "uninitialized value" errors: $blah (the scalar) hasn't been initialized, but you're using it. If you want the $n-th line from @blah, use this:
I don't want to keep beating a dead horse, but I do want to address a better way to do one thing: the ls | grep ".txt" call.
This reads in a list of all files in the current directory that have a ".txt" extension in them. This works, and is effective, but it can be rather slow: we have to call out to the shell, which has to fork off to run ls and grep, and that incurs a bit of overhead. Furthermore, ls and grep are simple and common programs, but not exactly portable. Surely there's a better way to do this:
Simple, short, pure Perl, no forking, no non-portable shells, and we don't have to read in the string and then split it; we can store only the entries we really need. Plus, it becomes trivial to modify the conditions for files that pass the test. Say we end up accidentally reading the file test.txt.gz because our regex matches: we can easily change that line to:
We can do that one with grep (I believe), but why settle for grep's limited regular expressions when Perl has one of the most powerful regex libraries anywhere built in?
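The code blocks of this answer did not survive on this page. The points it makes can be sketched as follows, assuming opendir/readdir with an anchored regex was the "pure Perl" listing it had in mind. Both helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: the $n-th line (0-based) of an already opened handle.
sub nth_line {
    my ($fh, $n) = @_;
    my @blah = <$fh>;    # a plain scalar inside <>: reads every remaining line
    return $blah[$n];    # one element of @blah, not the unrelated scalar $blah
}

# Hypothetical helper: .txt names in $dir, found without calling a shell.
sub txt_names_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!\n";
    # The trailing $ anchors the match, so test.txt.gz is rejected.
    my @names = sort grep { /\.txt$/ } readdir($dh);
    closedir($dh);
    return @names;
}

print "$_\n" for txt_names_in('.');
```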
Use braces around $files[$i] inside the <> operator. Otherwise Perl interprets <> as the file glob operator instead of the read-from-filehandle operator.
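The snippet for this answer is missing. As a hedged alternative sketch: perlop describes <> as a filehandle read only when it contains a bare filehandle or a simple scalar variable, so the code below avoids depending on any other form. It either calls readline directly or copies the handle into a plain scalar first. Both helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: all remaining lines from the $i-th handle in @$files.
sub lines_from {
    my ($files, $i) = @_;
    my @lines = readline($files->[$i]);   # readline() accepts any expression
    return @lines;
}

# Equivalent: copy the handle into a plain scalar, then use <> on that.
sub lines_from_copy {
    my ($files, $i) = @_;
    my $fh = $files->[$i];
    my @lines = <$fh>;
    return @lines;
}

my $text = "one\ntwo\n";
open(my $in, '<', \$text) or die $!;
my @files = ($in);
print lines_from(\@files, 0);
```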
You've got some good answers already. Another way to tackle the problem is to create a list-of-lists containing all of the lines from the files (@content). Then use the each_arrayref function from List::MoreUtils, which will create an iterator that yields line 1 from all files, then line 2, etc.
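A sketch of that idea, since the answer's code is not preserved. It assumes List::MoreUtils is installed from CPAN and uses hard-coded sample data in place of lines read from files; concat_by_line_number is a hypothetical name.

```perl
use strict;
use warnings;
use List::MoreUtils qw(each_arrayref);

# Hypothetical helper: join line 1 of every file, then line 2, and so on.
sub concat_by_line_number {
    my @content = @_;                      # one array ref of lines per file
    my $next    = each_arrayref(@content); # iterator over parallel elements
    my $concat  = '';
    while (my @nth = $next->()) {
        # A file that has run out yields undef; treat that as an empty string.
        $concat .= join '', map { defined $_ ? $_ : '' } @nth;
    }
    return $concat;
}

# Sample data standing in for the lines of two files.
my @content = (["a1\n", "a2\n"], ["b1\n"]);
print concat_by_line_number(@content);   # prints a1, b1, a2, one per line
```

Reading every file into memory first is simple, but for very large files a handle-by-handle loop may be the better fit.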