Reduce a list of folders to the lowest common folders
I have a giant list of file paths that is simply too large for our SCM to process. I need to whittle it down based on the lowest common-level folder. For example, given the following paths:
//folder1/folder2/folder2
//folder1/folder2/folder5
//folder1/folder3/folder6
//folderx/foldery/folder9
//folderx/foldery/folder10
Based on that, I would like to arrive at this:
//folder1/folder2
//folder1/folder3
//folderx/foldery
The folder list will be read from a text file, and is around 2 million lines long.
Any help would be greatly appreciated.
Comments (2)
This looks to be a good use for split() and hashes:
If you only want to print out paths that have been seen twice or more, insert
next if $seen{$rootpath}{$secondpath} > 1;
before the print(). I haven't tested this, so there could be syntax errors, but the code gives the general gist.
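As a minimal sketch of the split()-plus-hash idea described above (not necessarily the code this answer had in mind), the following assumes every path starts with "//" and has at least two folder levels. The subroutine name print_common_folders and its filehandle arguments are hypothetical, chosen only for illustration. It reads line by line, so only the unique two-level prefixes are held in memory rather than all 2 million lines.

```perl
use strict;
use warnings;

# Hypothetical helper: read paths such as //folder1/folder2/folder5 from $in
# and print each distinct //root/second prefix to $out, in first-seen order.
sub print_common_folders {
    my ($in, $out) = @_;
    my %seen;    # $seen{root}{second} = number of times the prefix was seen
    while (my $line = <$in>) {
        chomp $line;
        # Splitting "//a/b/c" on "/" yields ("", "", "a", "b", "c"),
        # so the two leading empty fields are discarded.
        my (undef, undef, $rootpath, $secondpath) = split m{/}, $line;
        # Skip blank lines and paths with fewer than two folder levels.
        next unless defined $rootpath && defined $secondpath;
        # Post-increment returns the old count, which is 0 (false)
        # only the first time a prefix appears.
        next if $seen{$rootpath}{$secondpath}++;
        print {$out} "//$rootpath/$secondpath\n";
    }
    return;
}
```

It could be called as print_common_folders(\*STDIN, \*STDOUT) with the text file redirected to standard input. Note that this always cuts at a fixed depth of two folders, which matches the example in the question but is an assumption about the real data.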
How about:
output: