How can I use Perl's readdir efficiently multiple times on the same directory?

Posted on 2024-08-16 10:11:39

I have a question about using Perl's readdir(). I want to gather all the files in a directory whose names start with a prefix I specify. So, for each prefix, I need to use readdir() and grep to pick out the matching files.

Suppose the prefix is "abc"; there are several files named "abc_1", "abc_2", etc.

However, I noticed that if I put opendir and closedir outside the loop (which iterates over a list of file name prefixes), grep only finds files for the very first prefix; every later grep comes back empty. If I call opendir and closedir on every iteration, it works fine, but I'm afraid that is not efficient at all.

My question is: how can I make this more efficient? It seems weird that I can't call readdir multiple times in a loop.

Thanks a lot in advance!

-Jin

Comments (6)

遇见了你 2024-08-23 10:11:39

Directory (and file) handles are iterators. Reading from one consumes its data, so you need to either store that data or reset the iterator's position. Closing and reopening is the hard way; use rewinddir instead.
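To make the rewinddir suggestion concrete, here is a minimal sketch. The temporary directory, the file names, and the @prefixes list are all made up for illustration; only opendir, readdir, rewinddir and closedir come from the discussion above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory with a few empty files, so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(abc_1 abc_2 xyz_1 other)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

my @prefixes = qw(abc xyz);   # hypothetical prefix list
my %files_for;

opendir(my $dh, $dir) or die "Cannot open $dir: $!";
for my $prefix (@prefixes) {
    # Reset the handle to the first entry. Without this, the list-context
    # readdir below returns nothing on the second and later iterations.
    rewinddir($dh);
    $files_for{$prefix} = [ sort grep { /^\Q$prefix\E_/ } readdir($dh) ];
}
closedir($dh);

print "$_: @{ $files_for{$_} }\n" for @prefixes;
```

This still rereads the whole directory once per prefix; rewinddir only saves the close and reopen.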

Alternatively, use glob to do the reading and filtering in one step.
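A rough sketch of the glob route, under the same made-up setup (temporary directory, file names, and @prefixes are assumptions). It uses bsd_glob from the core File::Glob module rather than the built-in glob, only because the built-in splits its pattern on whitespace; for paths without spaces the two behave the same way here.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Throwaway directory with a few empty files, so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(abc_1 abc_2 xyz_1 other)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

my @prefixes = qw(abc xyz);   # hypothetical prefix list
my %files_for;

for my $prefix (@prefixes) {
    # The pattern does the filtering; each result includes the directory part.
    $files_for{$prefix} = [ bsd_glob("$dir/${prefix}_*") ];
}

print "$_: ", scalar @{ $files_for{$_} }, " file(s)\n" for @prefixes;
```

No directory handle is involved, so there is nothing to rewind. Note that the prefix is treated as part of a glob pattern, so characters such as * or [ in a prefix would need escaping.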

早乙女 2024-08-23 10:11:39

Why don't you read all the files once and then perform the filtering on that list?
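Something along these lines, as a sketch: the directory is read exactly once into an array, and every prefix is then matched against that in-memory list. The temporary directory, file names, and @prefixes are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory with a few empty files, so the sketch is self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(abc_1 abc_2 xyz_1 other)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# Single pass over the directory: keep regular files only (drops . and ..).
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @entries = grep { -f "$dir/$_" } readdir($dh);
closedir($dh);

my @prefixes = qw(abc xyz);   # hypothetical prefix list
my %files_for;
for my $prefix (@prefixes) {
    # Plain string comparison: the name must start with "<prefix>_".
    $files_for{$prefix} = [ sort grep { index($_, "${prefix}_") == 0 } @entries ];
}

print "$_: @{ $files_for{$_} }\n" for @prefixes;
```

The filtering loop never touches the filesystem, so the cost per extra prefix is just a scan of @entries.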

哥,最终变帅啦 2024-08-23 10:11:39

Would rewinddir() be of assistance at this juncture?

叹沉浮 2024-08-23 10:11:39

Why dontcha just let @files = <abc_*>?

不气馁 2024-08-23 10:11:39

Use the Text::Trie module to group files in one pass through readdir:

use File::Spec::Functions qw/ catfile /;
use Text::Trie qw/ Trie walkTrie /;

sub group_files {
  my($dir,$pattern) = @_;

  opendir my $dh, $dir or die "$0: opendir $dir: $!";

  my @trie = Trie readdir $dh;

  my @groups;
  my @prefix;
  my $group = [];

  my $exitnode = sub {
    pop @prefix;
    unless (@prefix) {
      push @groups => $group if @$group;
      $group = [];
    }
  };

  my $leaf = sub {
    local $_ = join "" => @prefix;
    if (/$pattern/) {
      my $full = catfile $dir => "$_$_[0]";
      push @$group => $full if -f $full;
    }
    $exitnode->() unless @prefix;
  };

  my $node = sub { push @prefix => $_[0] };

  @$_[0,1,5] = ($leaf, $node, $exitnode) for \my @callbacks;
  walkTrie @callbacks => @trie;

  wantarray ? @groups : \@groups;
}

You might use it like this:

my($pattern,$dir) = @ARGV;

$pattern //= "^";
$dir     //= ".";

my $qr = eval "qr/$pattern/" || die "$0: bad pattern ($pattern)\n";
my @groups = group_files $dir, $qr;

use Data::Dumper;
print Dumper \@groups;

For example:

$ ls
abc_1  abc_12  abc_2  abc_3  abc_4  prefixes  xy_7  xyz_1  xyz_2  xyz_3

$ ./prefixes
$VAR1 = [
          [
            './prefixes'
          ],
          [
            './abc_4',
            './abc_1',
            './abc_12',
            './abc_3',
            './abc_2'
          ],
          [
            './xy_7',
            './xyz_1',
            './xyz_3',
            './xyz_2'
          ]
        ];

Use the optional regular-expression argument as a predicate on prefixes:

$ ./prefixes '^.{3,}'
$VAR1 = [
          [
            './abc_4',
            './abc_1',
            './abc_12',
            './abc_3',
            './abc_2'
          ],
          [
            './xyz_1',
            './xyz_3',
            './xyz_2'
          ]
        ];

$ ./prefixes '^.{2,}'
$VAR1 = [
          [
            './abc_4',
            './abc_1',
            './abc_12',
            './abc_3',
            './abc_2'
          ],
          [
            './xy_7',
            './xyz_1',
            './xyz_3',
            './xyz_2'
          ]
        ];

雨的味道风的声音 2024-08-23 10:11:39

I would code this in a single pass as follows:

while readdir() returns a file name
    if the file prefix has not been seen before
        record prefix and create directory for this prefix
    end if
    move (copy?) file to correct directory
end while

For the anally retentive here is some (untested) code that should work. Error handling is left as an exercise for the reader.

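use strict;
use warnings;
use File::Copy qw(move);   # "use" rather than "require", so that move() is imported

my $old_base_dir = "original_directory_path";
my $new_base_dir = "new_directory_path";   # assumed to exist already

opendir(my $dir_handle, $old_base_dir)
    or die "Cannot open $old_base_dir: $!";

my %dir_list;   # prefixes whose target directory has already been created

while (defined(my $file_name = readdir($dir_handle))) {
    next if ! -f "$old_base_dir/$file_name";   # only move regular files
    my ($prefix) = split /_/, $file_name, 2;   # assume first _ marks end of prefix

    # Create the per-prefix directory the first time the prefix is seen.
    mkdir "$new_base_dir/$prefix" unless $dir_list{$prefix}++;

    move("$old_base_dir/$file_name", "$new_base_dir/$prefix/$file_name"); # assume unix system
}

closedir($dir_handle);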