What is the best strategy for deleting a very large folder using Perl?

Posted 08-27 21:20

I need to delete all content (files and folders) under a given folder. The problem is that the folder has millions of files and folders inside it, so I don't want to load all the file names in one go.

Logic should be like this:

  • iterate over a folder without loading everything
  • get a file or folder
  • delete it (and report that the file or folder "X" was deleted)
  • go to the next one

I'm trying something like this:

use strict;
use warnings;
use Getopt::Long;
use File::Find;

our $name;    # aliased to $File::Find::name in main()
my $usage = "usage: $0 --ROOT=dir [--DEBUG] [--HELP]\n";

sub main(){
  my ($rc, $help, $debug, $root)   = ();
  $rc = GetOptions ( "HELP"           => \$help,
                     "DEBUG"          => \$debug,
                     "ROOT=s"         => \$root);

  die "Bad command line options\n$usage\n" unless ($rc);
  if ($help) { print $usage; exit (0); }

  if ($debug) {
      warn "\nProceeding to execution with following parameters: \n";
      warn "===============================================================\n";
      warn "ROOT = $root\n";

  } # write debug information to STDERR

  print "\n Starting to delete...\n";

  die "usage: $0 dir ..\n" unless $root;
  *name = *File::Find::name;
  find \&verbose, $root;    # traverse the directory given with --ROOT

}

# Only prints what would be removed; nothing is deleted yet
sub verbose {
    if (!-l && -d _) {
        print "rmdir $name\n";
    } else {
        print "unlink $name\n";
    }
}

main();

It's working fine, but whenever "find" reads the huge folder, the application gets stuck and I can see the system memory for Perl increasing until timeout. Why? Is it trying to load all the files in one go?

Thanks for your help.

Comments (7)

呆° 2024-09-03 21:20:20

The remove_tree function from File::Path can portably and verbosely remove a directory hierarchy, keeping the top directory, if desired.

use strict;
use warnings;
use File::Path qw(remove_tree);

my $dir = '/tmp/dir';
remove_tree($dir, {verbose => 1, keep_root => 1});

Pre-5.10, use the rmtree function from File::Path. If you still want the top directory, you could just mkdir it again.

use File::Path;

my $dir = '/tmp/dir';
rmtree($dir, 1);  # 1 means verbose
mkdir $dir;
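
If some entries cannot be removed (permissions, files in use), remove_tree can collect the failures in a list instead of printing warnings. A minimal sketch, assuming File::Path 2.x; the helper name clear_dir is hypothetical and not part of the answer above:

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Hypothetical helper: empty $dir but keep $dir itself.
# Returns the number of entries removed, followed by any error strings.
sub clear_dir {
    my ($dir) = @_;
    my $removed = remove_tree($dir, {
        keep_root => 1,          # keep the top-level directory
        error     => \my $err,   # collect failures instead of carping
    });
    my @errors;
    for my $diag (@{ $err || [] }) {
        # Each diagnostic is a one-pair hash: path => message
        my ($path, $message) = %$diag;
        push @errors, $path eq '' ? "general error: $message"
                                  : "$path: $message";
    }
    return ($removed, @errors);
}
```

Calling `my ($removed, @errors) = clear_dir('/tmp/dir');` would then let the caller decide whether a non-empty @errors is fatal.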
遇到 2024-09-03 21:20:20

The perlfaq points out that File::Find does the hard work of traversing a directory, but the work isn't that hard (assuming your directory tree is free of named pipes, block devices, etc.):

sub traverse_directory {
    my $dir = shift;
    opendir my $dh, $dir or do {
        warn "opendir $dir failed: $!\n";
        return;
    };
    # readdir in scalar context returns one entry per call
    while (defined(my $file = readdir($dh))) {
        next if $file eq "." || $file eq "..";
        if (-d "$dir/$file" && !-l "$dir/$file") {
            # A real directory (not a symlink to one): recurse into it
            traverse_directory("$dir/$file");
        } elsif (-f "$dir/$file") {
            # $dir/$file is a regular file
            # Do something with it, for example:
            print "Removing $dir/$file\n";
            unlink "$dir/$file" or warn "unlink $dir/$file failed: $!\n";
        } else {
            warn "$dir/$file is not a directory or regular file. Ignoring ...\n";
        }
    }
    closedir $dh;
    # $dir might be empty at this point. If you want to delete it:
    if (rmdir $dir) {
        print "Removed $dir/\n";
    } else {
        warn "rmdir $dir failed: $!\n";
    }
}

Substitute your own code for doing something with a file or (possibly) empty directory, and call this function once on the root of the tree that you want to process. Look up the meanings of opendir/closedir, readdir, -d, and -f if you haven't encountered them before.
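
On the memory concern from the question: readdir in scalar context returns a single entry per call, so a loop like the one above holds only one name at a time, whereas readdir in list context returns all remaining entries of the directory at once. A small sketch of the scalar form, using a hypothetical helper count_entries that only counts and deletes nothing:

```perl
use strict;
use warnings;

# Hypothetical helper: count the entries directly inside $dir,
# reading one name per readdir call (scalar context).
sub count_entries {
    my ($dir) = @_;
    opendir my $dh, $dir or die "opendir $dir failed: $!\n";
    my $count = 0;
    while (defined(my $entry = readdir $dh)) {
        next if $entry eq '.' || $entry eq '..';
        $count++;
    }
    closedir $dh;
    return $count;
}
```

Writing `my @all = readdir $dh;` instead would build the whole list in memory, which is what the question wants to avoid for a directory with millions of entries.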

世界如花海般美丽 2024-09-03 21:20:20

What's wrong with:

`rm -rf $folder`;  # ??
七婞 2024-09-03 21:20:20

You can use File::Find to systematically traverse the directory and delete the files and directories under it.
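
A minimal sketch of that approach, with a hypothetical helper named delete_under. finddepth visits a directory's contents before the directory itself, so each directory should already be empty when rmdir is called; no_chdir makes $_ hold the full path. It assumes $root is passed without a trailing slash.

```perl
use strict;
use warnings;
use File::Find qw(finddepth);

# Hypothetical helper: delete everything below $root, keeping $root itself.
# Returns the number of entries successfully removed.
sub delete_under {
    my ($root) = @_;
    my $removed = 0;
    finddepth({
        no_chdir => 1,    # $_ is the full path, same as $File::Find::name
        wanted   => sub {
            my $path = $_;
            return if $path eq $root;    # keep the top directory
            # Real directories get rmdir; files and symlinks get unlink
            my $ok = (!-l $path && -d _) ? rmdir($path) : unlink($path);
            if ($ok) {
                print "removed $path\n";
                $removed++;
            } else {
                warn "could not remove $path: $!\n";
            }
        },
    }, $root);
    return $removed;
}
```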

素手挽清风 2024-09-03 21:20:20

OK, I gave in and used Perl builtins, but you should use File::Path::rmtree, which I had totally forgotten about:

#!/usr/bin/perl

use strict; use warnings;
use Cwd;
use File::Find;

my ($clean) = @ARGV;
die "specify directory to clean\n" unless defined $clean;

my $current_dir = getcwd;
chdir $clean
    or die "Cannot chdir to '$clean': $!\n";

finddepth(\&wanted => '.');

chdir $current_dir
    or die "Cannot chdir back to '$current_dir':$!\n";

sub wanted {
    return if /^[.][.]?\z/;
    warn "$File::Find::name\n";
    if ( -f ) {
        unlink or die "Cannot delete '$File::Find::name': $!\n";
    }
    elsif ( -d _ ) {
        rmdir or die "Cannot remove directory '$File::Find::name': $!\n";
    }
    return;
}
不喜欢何必死缠烂打 2024-09-03 21:20:20

Download the Unix tools for Windows and then you can do rm -rv or whatever.

Perl is a great tool for a lot of purposes, but this one seems better done by a specialised tool.

许你一世情深 2024-09-03 21:20:20

Here's a cheap "cross-platform" method:

use Carp    qw<carp croak>;
use English qw<$OS_NAME>;
use File::Spec;

# Shell command templates; /Q stops rmdir /S from prompting for confirmation
my %deltree_op = ( nix => 'rm -rf %s', win => 'rmdir /S /Q %s' );

my %group_for
    = ( ( map { $_ => 'nix' } qw<linux UNIX SunOS> )
      , ( map { $_ => 'win' } qw<MSWin32 WinNT>    )
      );

my $group_name = $group_for{$OS_NAME};
sub chop_tree {
   my $full_path = shift;
   unless ( -e $full_path ) {
       carp( "No directory $full_path exists! We're done." );
       return;    # nothing to delete
   }
   croak( "No implementation for $OS_NAME!" ) unless $group_name;
   my $format = $deltree_op{$group_name};
   croak( "Could not find command format for group $group_name" ) unless $format;
   # Note: the path is interpolated unquoted into a shell command
   my $command = sprintf( $format, File::Spec->canonpath( $full_path ));
   qx{$command};
}
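
The command strings above are interpolated into a shell, so a path containing spaces or shell metacharacters would break them. A sketch of a variant that builds the command as an argument list for system instead; the helper name deltree_command is hypothetical, and /Q is the cmd.exe switch that suppresses the confirmation prompt of rmdir /S. On Unix-like systems, system called with a list does not go through the shell.

```perl
use strict;
use warnings;

# Hypothetical helper: return the delete command for $path as a list,
# suitable for system(LIST). $os defaults to the running platform ($^O).
sub deltree_command {
    my ($path, $os) = @_;
    $os = $^O unless defined $os;
    return $os eq 'MSWin32'
        ? ('cmd', '/c', 'rmdir', '/S', '/Q', $path)
        : ('rm', '-rf', '--', $path);
}

# Usage (not run here):
#   system(deltree_command('/tmp/dir')) == 0
#       or warn "delete command failed: $?\n";
```

Treating every non-Windows platform as Unix-like is an assumption of this sketch; the original answer instead croaks on platforms it does not recognise.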