Reducing a list of folders to the lowest common folders

Posted 2024-12-07 09:52:36

I have a giant list of file paths that are simply too large for our SCM to process. I need to whittle them down based on the lowest common level folder. For example, given the following paths:

//folder1/folder2/folder2
//folder1/folder2/folder5
//folder1/folder3/folder6
//folderx/foldery/folder9
//folderx/foldery/folder10

Based on that, I would like to arrive at this:

//folder1/folder2
//folder1/folder3
//folderx/foldery

The folder list will be read from a text file, and is around 2M lines long.

Any help would be greatly appreciated.

Comments (2)

古镇旧梦 2024-12-14 09:52:36

This looks to be a good use for split() and hashes:

use strict;
use warnings;

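chomp( my @paths = <> );                 # read paths from the files named on the command line, or STDIN

my %seen;
foreach my $path ( @paths ) {
  $path =~ s|^//||;                      # Strip off leading //
  my @elems = split( '/', $path );
  next unless @elems >= 2;               # Skip lines with fewer than two levels
  $seen{$elems[0]}{$elems[1]}++;         # Count each root/second-level pair
}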

foreach my $rootpath ( sort keys %seen ) {
  foreach my $secondpath ( sort keys %{$seen{$rootpath}} ) {
    print "//" . $rootpath . "/" . $secondpath . "\n";
  }
}

If you only want to print out paths that have been seen twice or more, insert a next unless $seen{$rootpath}{$secondpath} > 1; before the print().

I haven't tested this so there could be syntax errors, but the code gives the general gist.
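
As a rough sketch of how the counting and the "seen twice or more" filter could fit together for a large file, the version below reads one line at a time instead of holding every path in an array. The helper name two_level_prefixes and its $min_count parameter are made up for this illustration, and it assumes every useful line has at least two folder levels after the leading //.

```perl
use strict;
use warnings;

# Hypothetical helper: count the first two path levels of every line read
# from $fh, then return the prefixes seen at least $min_count times.
sub two_level_prefixes {
    my ( $fh, $min_count ) = @_;
    my %seen;
    while ( my $path = <$fh> ) {
        chomp $path;
        $path =~ s|^//||;                    # strip off the leading //
        my @elems = split( '/', $path );
        next unless @elems >= 2;             # ignore lines with fewer than two levels
        $seen{ $elems[0] }{ $elems[1] }++;
    }
    my @result;
    foreach my $root ( sort keys %seen ) {
        foreach my $second ( sort keys %{ $seen{$root} } ) {
            next unless $seen{$root}{$second} >= $min_count;
            push @result, "//$root/$second";
        }
    }
    return @result;
}

# Demo on the sample from the question, using an in-memory filehandle.
my $sample = join "\n",
    '//folder1/folder2/folder2',
    '//folder1/folder2/folder5',
    '//folder1/folder3/folder6',
    '//folderx/foldery/folder9',
    '//folderx/foldery/folder10';
open( my $fh, '<', \$sample ) or die "Cannot open in-memory handle: $!";
print "$_\n" for two_level_prefixes( $fh, 2 );
close $fh;
```

With a minimum count of 2 this prints //folder1/folder2 and //folderx/foldery; passing 1 instead would also include //folder1/folder3. For the real input, the in-memory handle would be replaced by one opened on the text file.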

一场信仰旅途 2024-12-14 09:52:36

How about:

#!/usr/local/bin/perl 
use strict;
use warnings;
use 5.010;

my %out;
while(<DATA>) {
    chomp;
    # Skip lines without two leading folders, so a stale $1 is never reused
    next unless m#^(//[^/]+/[^/]+)#;
    $out{$1} = 1;
}
say for keys %out;

__DATA__
//folder1/folder2/folder2
//folder1/folder2/folder5
//folder1/folder3/folder6
//folderx/foldery/folder9
//folderx/foldery/folder10

Output (hash keys come back in no guaranteed order, so the lines may appear in a different order):

//folderx/foldery
//folder1/folder3
//folder1/folder2
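
A possible variation on the regex approach, sketched under a few assumptions: the prefix depth is passed in rather than hard-coded to two folders, the input comes from a filehandle rather than DATA, and the result is sorted so the order is stable between runs. The name unique_prefixes and the $depth parameter are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique prefixes made of the first $depth
# folders of every path read from $fh, sorted for a stable order.
sub unique_prefixes {
    my ( $fh, $depth ) = @_;
    $depth = 2 unless defined $depth;        # default to two folders
    my $extra = $depth - 1;                  # folders after the first one
    my $re    = qr#^(//[^/]+(?:/[^/]+){$extra})#;
    my %out;
    while ( my $line = <$fh> ) {
        chomp $line;                         # keep the newline out of the capture
        next unless $line =~ $re;            # skip lines that are too shallow
        $out{$1} = 1;
    }
    my @sorted = sort keys %out;
    return @sorted;
}

# Demo on the sample paths, using an in-memory filehandle.
my $sample = join "\n",
    '//folder1/folder2/folder2',
    '//folder1/folder2/folder5',
    '//folder1/folder3/folder6',
    '//folderx/foldery/folder9',
    '//folderx/foldery/folder10';
open( my $in, '<', \$sample ) or die "Cannot open in-memory handle: $!";
print "$_\n" for unique_prefixes( $in, 2 );
close $in;
```

For the sample this prints //folder1/folder2, //folder1/folder3 and //folderx/foldery, in that order. Memory use grows with the number of distinct prefixes rather than the number of input lines, which should help with a file of around 2M lines.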