What is the best strategy for deleting a very large folder using Perl?
I need to delete all content (files and folders) under a given folder. The problem is that the folder has millions of files and folders inside it, so I don't want to load all the file names in one go.
The logic should be like this:
- iterate over a folder without loading everything
- get a file or folder
- delete it (verbosely: report that the file or folder "X" was deleted)
- go to the next one
I'm trying something like this:
sub main(){
    my ($rc, $help, $debug, $root) = ();
    $rc = GetOptions ( "HELP"   => \$help,
                       "DEBUG"  => \$debug,
                       "ROOT=s" => \$root);
    die "Bad command line options\n$usage\n" unless ($rc);
    if ($help) { print $usage; exit (0); }
    if ($debug) {
        warn "\nProceeding to execution with following parameters: \n";
        warn "===============================================================\n";
        warn "ROOT = $root\n";
    } # write debug information to STDERR
    print "\n Starting to delete...\n";
    die "usage: $0 dir ..\n" unless $root;
    *name = *File::Find::name;
    find \&verbose, @ARGV;
}
sub verbose {
    if (!-l && -d _) {
        print "rmdir $name\n";
    } else {
        print "unlink $name\n";
    }
}
main();
It's working fine, but whenever "find" reads the huge folder, the application gets stuck and I can see the system memory for Perl increasing until timeout. Why? Is it trying to load all the files in one go?
Thanks for your help.
Comments (7)
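The perlfaq points out that File::Find does the hard work of traversing a directory, but the work isn't that hard (assuming your directory tree doesn't have named pipes, block devices, etc.):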
sub traverse_directory {
    my $dir = shift;
    opendir my $dh, $dir or do {
        warn "opendir $dir failed: $!\n";
        return;
    };
    # readdir in scalar context returns one entry per call
    while (defined(my $file = readdir $dh)) {
        next if $file eq "." || $file eq "..";
        if (-d "$dir/$file") {
            traverse_directory("$dir/$file");
        } elsif (-f "$dir/$file") {
            # $dir/$file is a regular file
            # Do something with it, for example:
            print "Removing $dir/$file\n";
            unlink "$dir/$file" or warn "unlink $dir/$file failed: $!\n";
        } else {
            warn "$dir/$file is not a directory or regular file. Ignoring ...\n";
        }
    }
    closedir $dh;
    # $dir might be empty at this point. If you want to delete it:
    if (rmdir $dir) {
        print "Removed $dir/\n";
    } else {
        warn "rmdir $dir failed: $!\n";
    }
}
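Substitute your own code for what you want to do with a file or a (possibly) empty directory, and call this function once on the root of the tree that you want to process. Look up the meanings of opendir/closedir, readdir, -d, and -f if you haven't encountered them before.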
OK, I gave in and used Perl builtins, but you should use File::Path::rmtree, which I had totally forgotten about:
#!/usr/bin/perl
use strict; use warnings;
use Cwd;
use File::Find;

my ($clean) = @ARGV;
die "specify directory to clean\n" unless defined $clean;

my $current_dir = getcwd;
chdir $clean
    or die "Cannot chdir to '$clean': $!\n";
finddepth(\&wanted => '.');
chdir $current_dir
    or die "Cannot chdir back to '$current_dir':$!\n";

sub wanted {
    return if /^[.][.]?\z/;
    warn "$File::Find::name\n";
    if ( -f ) {
        unlink or die "Cannot delete '$File::Find::name': $!\n";
    }
    elsif ( -d _ ) {
        rmdir or die "Cannot remove directory '$File::Find::name': $!\n";
    }
    return;
}
Download the Unix tools for Windows, and then you can run rm -rv or whatever.
Perl is a great tool for a lot of purposes, but this job seems better handled by a specialised tool.
Here is a cheap "cross-platform" method:
use Carp qw<carp croak>;
use English qw<$OS_NAME>;
use File::Spec;

# /Q suppresses the confirmation prompt that rmdir /S would otherwise show
my %deltree_op = ( nix => 'rm -rf %s', win => 'rmdir /S /Q %s' );
my %group_for
    = ( ( map { $_ => 'nix' } qw<linux UNIX SunOS> )
      , ( map { $_ => 'win' } qw<MSWin32 WinNT> )
      );
my $group_name = $group_for{$OS_NAME};

sub chop_tree {
    my $full_path = shift;
    unless ( -e $full_path ) {
        carp( "No directory $full_path exists! We're done." );
        return;    # nothing to delete
    }
    croak( "No implementation for $OS_NAME!" ) unless $group_name;
    my $format = $deltree_op{$group_name};
    croak( "Could not find command format for group $group_name" ) unless $format;
    # The path is interpolated unquoted, so it must not contain spaces
    # or shell metacharacters
    my $command = sprintf( $format, File::Spec->canonpath( $full_path ));
    qx{$command};
}
The remove_tree function from File::Path can portably and verbosely remove a directory hierarchy, keeping the top directory if desired. Pre-5.10, use the rmtree function from File::Path. If you still want the top directory, you could just mkdir it again.
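A minimal sketch of that call, assuming a File::Path version recent enough to provide remove_tree with the keep_root, verbose, and error options. The helper name empty_directory is made up for illustration and is not part of File::Path:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# empty_directory is a hypothetical helper name, not part of File::Path.
# It deletes everything under $root, keeps $root itself, and returns the
# number of entries that could not be removed.
sub empty_directory {
    my ($root) = @_;

    remove_tree(
        $root,
        {
            verbose   => 1,            # print each unlink/rmdir as it happens
            keep_root => 1,            # leave the top directory in place
            error     => \my $errors,  # collect failures instead of warning
        }
    );

    # Each entry in @$errors is a one-pair hash: path => message.
    # An empty path means a general error not tied to a single file.
    for my $diag (@{ $errors || [] }) {
        my ($path, $message) = %$diag;
        warn $path eq ''
            ? "general error: $message\n"
            : "could not remove $path: $message\n";
    }

    return scalar @{ $errors || [] };
}

# Run only when a directory is given on the command line.
empty_directory($ARGV[0]) if @ARGV;
```

With keep_root set there is no need to mkdir the top directory again afterwards. Collecting failures through the error option lets the run continue past entries that cannot be deleted, which is probably what you want on a tree with millions of files.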