How do I compare the file list of a tar archive with a directory?

Posted 2024-08-01 23:37:25, 273 words, 11 views, 0 comments

I am still learning Perl. Can anyone suggest Perl code to compare the files in a .tar.gz archive with those under a directory path?

Let's say I have a tar.gz backup of the following directory path, which I took a few days ago.

a/file1
a/file2
a/file3
a/b/file4
a/b/file5
a/c/file5
a/b/d/file and so on..

Now I want to compare files and directories under this path with the tar.gz backup file.

Please suggest Perl code to do that.

Comments (5)

入画浅相思 2024-08-08 23:37:25

Perl is really overkill for this; a shell script will do fine. The steps you need to take:

  • Extract the tar to a temporary folder somewhere.
  • Run diff -ur on the two folders and redirect the output somewhere (or pipe it to less, as appropriate).
  • Clean up the temporary folder.

And you're done. It shouldn't take more than 5-6 lines. Something quick and untested:

#!/bin/sh
tmp="${TEMP:-/tmp}/$$"    # scratch directory named after this shell's PID
mkdir "$tmp"
tar -xz -f ../backups/backup.tgz -C "$tmp"
diff -ur "$tmp" ./ | less
rm -rf "$tmp"
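With GNU tar, the extract-and-diff steps above can sometimes be collapsed into tar's own compare mode. This is only a sketch, assuming GNU tar is installed (other tar implementations may not support --diff); the function name compare_backup is hypothetical.

```shell
#!/bin/sh
# compare_backup ARCHIVE DIR   (hypothetical helper name)
# Assumes GNU tar. --diff checks each archive member against the file of the
# same name under DIR and reports members that differ or are missing.
# It does not report files that exist only on disk.
compare_backup() {
    tar --diff -z -f "$1" -C "$2"
}
```

For example, compare_backup ../backups/backup.tgz . exits with a non-zero status when an archived file differs from, or is missing in, the current directory.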

匿名。 2024-08-08 23:37:25

Here's an example that checks whether every file in an archive also exists in a folder.

# $1 is the archive to test
# $2 is the base folder
tar --list -f "$1" | while IFS= read -r file
do
  # quote the path so names containing spaces survive intact
  if [[ -e "$2$file" ]]
    then
      echo "   \"$2$file\""
    else
      echo "no \"$2$file\""
  fi
done

This is how I tested this:

I removed / renamed config, then ran the following:

bash test Downloads/update-dnsomatic-0.1.2.tar.gz Downloads/

Which gave the output of:

   "Downloads/update-dnsomatic-0.1.2/"
no "Downloads/update-dnsomatic-0.1.2/config"
   "Downloads/update-dnsomatic-0.1.2/update-dnsomatic"
   "Downloads/update-dnsomatic-0.1.2/README"
   "Downloads/update-dnsomatic-0.1.2/install.sh"

I am new to bash / shell programming, so there is probably a better way to do this.
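The check above only runs in one direction (archive to folder). A possible way to cover both directions, comparing names only and not contents, is to sort the two file lists and feed them to comm. This is a minimal sketch: the function name list_diff and the output labels are made up for illustration, and it assumes regular files only and no newlines in file names.

```shell
#!/bin/sh
# list_diff ARCHIVE DIR   (hypothetical helper name)
# Prints regular-file names found on only one side. Names only, no contents.
list_diff() {
    archive=$1; dir=$2
    tmp=$(mktemp -d) || return 1
    # Archive side: drop directory entries (they end in "/") and any "./" prefix.
    tar -tzf "$archive" | grep -v '/$' | sed 's|^\./||' | LC_ALL=C sort > "$tmp/in_archive"
    # Folder side: paths relative to DIR, so they line up with the archive names.
    (cd "$dir" && find . -type f) | sed 's|^\./||' | LC_ALL=C sort > "$tmp/in_folder"
    LC_ALL=C comm -23 "$tmp/in_archive" "$tmp/in_folder" | sed 's/^/only in archive: /'
    LC_ALL=C comm -13 "$tmp/in_archive" "$tmp/in_folder" | sed 's/^/only in folder: /'
    rm -rf "$tmp"
}
```

Called as list_diff Downloads/update-dnsomatic-0.1.2.tar.gz Downloads/, it prints nothing when the two lists match.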

谈场末日恋爱 2024-08-08 23:37:25

The Archive::Tar and File::Find modules will be helpful. A basic example is shown below. It just prints information about the files in a tar and the files in a directory tree.

It was not clear from your question how you want to compare the files. If you need to compare the actual content, you will likely need the get_content() method in Archive::Tar::File. If a simpler comparison is adequate (for example, name, size, and mtime), you won't need much more than the methods used in the example below.

#!/usr/bin/perl
use strict;
use warnings;

# A utility function to display our results.
sub Print_file_info {
    print map("$_\n", @_), "\n";
}

# Print some basic information about files in a tar.
use Archive::Tar qw();
my $tar_file = 'some_tar_file.tar.gz';
my $tar = Archive::Tar->new($tar_file)
    or die "Cannot read $tar_file\n";   # new() returns false on failure
for my $ft ( $tar->get_files ){
    # The variable $ft is an Archive::Tar::File object.
    Print_file_info(
        $ft->name,
        $ft->is_file ? 'file' : 'other',
        $ft->size,
        $ft->mtime,
    );
}

# Print some basic information about files in a directory tree.
use File::Find;
my $dir_name = 'some_directory';
my @files;
find(sub {push @files, $File::Find::name}, $dir_name);
Print_file_info(
    $_,
    -f $_ ? 'file' : 'other',
    -s,
    (stat)[9],
) for @files;

镜花水月 2024-08-08 23:37:25

This might be a good starting point for a proper Perl program. It does do what the question asked for, though.

It was just hacked together, and ignores most of the best practices for Perl.

perl test.pl full                            \
     Downloads/update-dnsomatic-0.1.2.tar.gz \
     Downloads/                              \
     update-dnsomatic-0.1.2

#! /usr/bin/env perl
use strict;
use 5.010;
use warnings;
use autodie;

use Archive::Tar;
use File::Spec::Functions qw'catfile catdir';

my($action,$file,$directory,$special_dir) = @ARGV;

if( @ARGV == 1 ){
  $file = *STDOUT{IO};
}
if( @ARGV == 3 ){
  $special_dir = '';
}

sub has_file(_);
sub same_size($$);   # takes two arguments: file name and expected size
sub find_missing(\%$);

given( lc $action ){

  # only compare names
  when( @{[qw'simple name names']} ){
    my @list = Archive::Tar->list_archive($file);

    say qq'missing file: "$_"' for grep{ ! has_file } @list;
  }

  # compare names, sizes, contents
  when( @{[qw'full aggressive']} ){
    my $next = Archive::Tar->iter($file);
    my( %visited );

    while( my $file = $next->() ){
      next unless $file->is_file;
      my $name = $file->name;
      $visited{$name} = 1;

      unless( has_file($name) ){
        say qq'missing file: "$name"' ;
        next;
      }

      unless( same_size( $name, $file->size ) ){
        say qq'different size: "$name"';
        next;
      }

      next unless $file->size;

      unless( same_checksum( $name, $file->get_content ) ){
        say qq'different checksums: "$name"';
        next;
      }
    }

    say qq'file not in archive: "$_"' for find_missing %visited, $special_dir;
  }

}

sub has_file(_){
  my($file) = @_;
  if( -e catfile $directory, $file ){
    return 1;
  }
  return;
}

sub same_size($$){
  my($file,$size) = @_;
  if( -s catfile($directory,$file) == $size ){
    return $size || '0 but true';
  }
  return; # empty list/undefined
}

sub same_checksum{
  my($file,$contents) = @_;
  require Digest::SHA1;

  my($outside,$inside);

  my $sha1 = Digest::SHA1->new;
  {
    open my $io, '<', catfile $directory, $file;
    $sha1->addfile($io);
    close $io;
    $outside = $sha1->digest;
  }

  $sha1->add($contents);
  $inside = $sha1->digest;


  return 1 if $inside eq $outside;
  return;
}

sub find_missing(\%$){
  my($found,$current_dir) = @_;

  my(@dirs,@files);

  {
    my $open_dir = catdir($directory,$current_dir);
    opendir my($h), $open_dir;

    while( my $elem = readdir $h ){
      next if $elem =~ /^[.]{1,2}[\\\/]?$/;

      my $path = catfile $current_dir, $elem;
      my $open_path = catfile $open_dir, $elem;

      given($open_path){
        when( -d ){
          push @dirs, $path;
        }
        when( -f ){
          push @files, $path, unless $found->{$path};
        }
        default{
          die qq'not a file or a directory: "$path"';
        }
      }
    }
  }

  for my $path ( @dirs ){
    push @files, find_missing %$found, $path;
  }

  return @files;
}

After renaming config to config.rm, adding an extra character to README, changing a character in install.sh, and adding a file named .test, this is what it output:

missing file: "update-dnsomatic-0.1.2/config"
different size: "update-dnsomatic-0.1.2/README"
different checksums: "update-dnsomatic-0.1.2/install.sh"
file not in archive: "update-dnsomatic-0.1.2/config.rm"
file not in archive: "update-dnsomatic-0.1.2/.test"