How do I determine the next number to assign to a file when organizing a directory?

Posted on 2024-11-17 12:48:35

I am working on a Perl script to organize a folder that holds all of our order documents. The script works for the most part, except for a curve ball someone threw at me the other day.

The problem comes up when we have an order that we have redone recently. We are land surveyors, and sometimes we do a survey and then, a few years later, do what is called a "flyby": we go back and "append" another file to the order, either noting changes to the land or simply saying that everything is okay and nothing has changed.

This causes a problem for me because the new file we make has the same order number (document number) as the older document. For example, we might have a document named CF145323, which would consist of several single-page PDF files named CF145323.pdf, *_1.pdf, *_2.pdf, and so on.

What I am looking for is a way to modify my script to count the files it finds and determine the next file number in the sequence. So if *_1.pdf through *_3.pdf already exist, I want Perl to take the mismatched file and make it *_4.pdf. Follow me?

The other catch is that the files are sometimes in folders that do not match the first three digits of the number in the file name. That part I seem to have figured out; it is just the numbering I have not worked out.

Also, I am working in Windows, so I cannot use any Linux commands.

Here is my script in the state I last left it:

#!/usr/bin/perl
use strict;
use warnings;

# Root folder for Order Documents
my $orders_root = "C:\\Users\\Ian\\Desktop\\Order_docs";

# Keep track of how many files are processed
my $files_counter = 0;

# Keep track of how many junk files are processed
my $junk_counter = 0;

# Store a list of folders that match the 3 number naming scheme
my @matched_folders;

# Create a place to move junk files into
if (! -e "$orders_root\\Junk") {

    # Use Perl's built-in mkdir instead of shelling out
    mkdir "$orders_root\\Junk" or die "Cannot create $orders_root\\Junk: $!";

}

# Clear the screen
system "cls";

print "Processing files, please wait...\n\n";

# Open $orders_root
opendir(ORDERS_ROOT, "$orders_root") or die $!;

# Collect a list of all sub folders
my @folders = readdir(ORDERS_ROOT);

# Close $orders_root
closedir(ORDERS_ROOT);

# Remove the directories "." and ".." from the output
# (readdir does not guarantee they come first, so filter them by name)
@folders = grep { !/^\.{1,2}$/ } @folders;

foreach my $folder (@folders) {

    # Filter out all directories that don't match the numbering system
    if ($folder =~ / \d{3} /xm) {

        # If the folder matches the expression above, add it to the list of matched folders
        push @matched_folders, $folder;
    
        # Open each folder inside of the Order Documents root
        opendir(CURRENT_FOLDER, "$orders_root\\$folder")
            or do { warn "Cannot open $orders_root\\$folder: $!"; next };

        # Foreach folder opened, collect a list of files in the folder for sorting
        my @files = readdir(CURRENT_FOLDER);

        # Close the current folder
        closedir(CURRENT_FOLDER);

        # Remove the directories "." and ".." from the output
        @files = grep { !/^\.{1,2}$/ } @files;

        foreach my $file (@files) {

            # Match each file to the standard naming scheme
            # ([_-] accepts an underscore or a hyphen; the trailing $ rejects names like "x.pdf.bak")
            if ($file =~ /^ (C[AFL]|ME) \d{3} \d{3} ([_-] \d+)? \. pdf $/xi) {

                ++$files_counter;
            
            # If that file does not match, move it to a junk folder
            } else {
            
                ++$junk_counter;

                rename ("$orders_root\\$folder\\$file", "$orders_root\\Junk\\$file")
                    or warn "Could not move $file to Junk: $!";

            } # End pdf match           

        } # End foreach $file
        
    } # End folder match

} # End foreach $folder



foreach my $folder (@matched_folders) {
    
    # Open $folder
    opendir(CURRENT_FOLDER, "$orders_root\\$folder")
        or do { warn "Cannot open $orders_root\\$folder: $!"; next };

    # Collect a list of all files in the folder
    my @files = readdir(CURRENT_FOLDER);

    # Close $folder
    closedir(CURRENT_FOLDER);

    # Remove the directories "." and ".." from the output
    @files = grep { !/^\.{1,2}$/ } @files;
    
    foreach my $file (@files) {
        
        if ($file =~ /^ (?<office> (C[AFL]|ME)) (?<folder_num> \d{3}) (?<file_num> \d{3}([_-] \d+)?) \. (?<file_ext> pdf) $/xi) {
        
            my $office = uc($+{office});
            my $folder_num = $+{folder_num};
            my $file_num = $+{file_num};
            my $file_ext = lc($+{file_ext});
            
            # Change hyphens to a underscore
            $file_num =~ s/\-/_/;
            
            my $file_name = "$office$folder_num$file_num.$file_ext";
            my $fly_by_name = "${office}${folder_num}${file_num}_FB.$file_ext";
            
            # Check if the current file belongs in the current folder
            # (folder names are identifiers, so compare them as strings)
            if ($folder ne $folder_num) {

                # If the folder does not exist create the folder
                if (! -e "$orders_root\\$folder_num") {
                
                    mkdir "$orders_root\\$folder_num" or die "Cannot create $orders_root\\$folder_num: $!";
                    
                }
                
                # Check to see if the file already exists
                if (! -e "$orders_root\\$folder_num\\$file_name") {
                
                    # Moves the file to correct place, these are mismatched files
                    rename ("$orders_root\\$folder\\$file", "$orders_root\\$folder_num\\$file_name");
                
                } else {
                
                    # The target name is taken, so this file is a flyby. For now it is tagged "_FB";
                    # the goal is to append "_#" instead, where # is 1 + the last file number
                    rename ("$orders_root\\$folder\\$file", "$orders_root\\$folder_num\\$fly_by_name");
                
                }
            
            # Files are in the correct place, the file name will be corrected only
            } else {
            
                rename ("$orders_root\\$folder\\$file", "$orders_root\\$folder_num\\$file_name");
            
            }
        
        } # End $file match
        
    } # End foreach $file

} # End foreach $folder



# Show statistics after processing
print "Done!\n\n";
print scalar(@matched_folders), " folders processed\n";
print "$files_counter files processed\n";
print "$junk_counter junk files removed\n";

Comments (1)

謸气贵蔟 2024-11-24 12:48:35

Your script is rather large to wade through, but I suggest a different approach.

First, and most obvious, is something like this:

my $base = "CF145323";
my $num = 1;
$num++ while -f "${base}_$num.pdf";

my $filename = "${base}_$num.pdf";
print "$filename\n";

In other words, see if the file already exists. You would have to modify this to test the various directories you hold the files in, and this won't work if there are gaps in the numbering sequence.
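
One way to cope with both caveats is to scan the folders and take the highest suffix already in use, rather than counting up until a name is free. The following is only a sketch under a few assumptions: the sub name next_suffix is made up for illustration, the folder path is the one from the question, and both "_" and "-" are accepted before the number because the question's script accepts both.

```perl
use strict;
use warnings;

# Return the next suffix number for $base (e.g. "CF145323"), looking at
# every directory in @dirs. Gaps in the sequence are harmless because the
# highest existing number is used, not the first missing one.
sub next_suffix {
    my ($base, @dirs) = @_;
    my $max = 0;
    for my $dir (@dirs) {
        opendir(my $dh, $dir) or next;    # skip folders that cannot be read
        for my $file (readdir $dh) {
            # Match e.g. CF145323_2.pdf or cf145323-2.PDF
            if ($file =~ /^\Q$base\E[_-](\d+)\.pdf$/i) {
                $max = $1 if $1 > $max;
            }
        }
        closedir $dh;
    }
    return $max + 1;
}

# Example call; the path is taken from the question and is an assumption
my $n = next_suffix('CF145323', 'C:\\Users\\Ian\\Desktop\\Order_docs\\145');
print "CF145323_$n.pdf\n";
```

Pass every folder that might hold pages of the order, for example both the folder the file was found in and the folder it belongs in, so that the highest number across all of them wins.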

It might be easier to keep a record of each file and its latest generation. Typically that would be a hash, using, for example, 'CF145323' as the key and the latest version number as its value. The hash can be saved and restored between runs using the Storable module (very easy to use, and part of the core Perl distribution).
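
A rough sketch of that idea follows. The state file name order_counters.sto and the sub next_number are hypothetical; store and retrieve are the functions Storable exports for writing and reading a data structure.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Hypothetical file that keeps the counters between runs
my $state_file = 'order_counters.sto';

# Load the saved hash, or start with an empty one on the first run
my $latest = -e $state_file ? retrieve($state_file) : {};

# Hand out the next number for an order and remember it in the hash
sub next_number {
    my ($counters, $base) = @_;
    return ++$counters->{$base};
}

my $n = next_number($latest, 'CF145323');
print "CF145323_$n.pdf\n";

# Save the updated counters for the next run
store($latest, $state_file);
```

On the very first run the hash is empty, so it would need to be seeded from the files that already exist before it can be trusted.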
