UNIX: untar content into multiple folders
I have a tar.gz file about 13 GB in size. It contains about 1.2 million documents. When I untar it, all of these files end up in one single directory, and any read from that directory takes ages. Is there any way to split the files from the tar into multiple new folders?
For example, I would like to create new folders named [1, 2, ...], each holding 1000 files.
This is a quick and dirty solution, but it does the job in Bash without using any temporary files.
Same as a one-liner:
Depending on your shell settings, the "cut -d ' ' -f12" part, which retrieves the last column (the filename) of tar's content listing, could cause a problem, and you would have to modify it.
It worked with 1000 files, but if you have 1.2 million documents in the archive, consider testing this with something smaller first.
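The commands for this answer did not survive on this page, so the following is only a minimal sketch of the same idea, not the answerer's original code. It assumes Bash and a tar that reads gzip archives with `-z`; the function name `extract_in_batches` and its arguments are hypothetical. It lists member names with `tar -tzf` (names only, so no `cut` is needed) and extracts them in groups into numbered directories. Note that every group re-reads the archive from the start, which would be very slow on a 13 GB file, and that member names containing newlines are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): extract ARCHIVE into
# numbered directories OUTDIR/1, OUTDIR/2, ... with PER_DIR entries each.
# Usage: extract_in_batches ARCHIVE [PER_DIR] [OUTDIR]
extract_in_batches() {
  local archive=$1 per_dir=${2:-1000} outdir=${3:-.}
  local dir=0 name
  local -a batch=()

  while IFS= read -r name; do
    case $name in */) continue ;; esac   # skip directory entries
    batch+=("$name")
    if (( ${#batch[@]} == per_dir )); then
      dir=$((dir + 1))
      mkdir -p "$outdir/$dir" || return 1
      # Each call scans the whole archive again to find this batch.
      tar -xzf "$archive" -C "$outdir/$dir" "${batch[@]}" || return 1
      batch=()
    fi
  done < <(tar -tzf "$archive")

  # Extract whatever is left over (fewer than PER_DIR entries).
  if (( ${#batch[@]} > 0 )); then
    dir=$((dir + 1))
    mkdir -p "$outdir/$dir" || return 1
    tar -xzf "$archive" -C "$outdir/$dir" "${batch[@]}" || return 1
  fi
}
```

With a hypothetical archive name, a call might look like `extract_in_batches docs.tar.gz 1000 out`. As the answer suggests, try it on a small archive first.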
Thus:
If you have GNU tar, you might be able to make use of the --checkpoint and --checkpoint-action options. I have not tested this, but I'm thinking something like:
You can look at the man page and see if there are options like that. If worst comes to worst, just extract the files you need (maybe using --exclude) and put them into your folders.
tar doesn't provide that capability directly. It only restores files into the same structure from which the archive was originally generated.
Can you modify the source directory to create the desired structure there and then tar the tree? If not, you could untar the files as they are in the archive and then post-process that directory with a script that moves the files into the desired arrangement. Given the number of files, this will take some time, but at least it can be done in the background.
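The post-processing step could be sketched roughly as below. This is an illustration under stated assumptions, not a tested production script: it assumes Bash and a `find` that supports `-maxdepth` and `-print0` (GNU and BSD versions do), and the function name `bucket_files` and its arguments are hypothetical. It moves the regular files sitting directly in one directory into numbered subdirectories of a destination, one `mv` call per group, so 1.2 million files need about 1200 moves rather than one process per file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move the regular files directly inside SRC into
# numbered directories DEST/1, DEST/2, ... with PER_DIR files each.
# Usage: bucket_files SRC DEST [PER_DIR]
bucket_files() {
  local src=$1 dest=$2 per_dir=${3:-1000}
  local dir=0 file
  local -a batch=()

  # NUL-delimited names keep spaces and newlines in file names intact.
  while IFS= read -r -d '' file; do
    batch+=("$file")
    if (( ${#batch[@]} == per_dir )); then
      dir=$((dir + 1))
      mkdir -p "$dest/$dir" && mv -- "${batch[@]}" "$dest/$dir/" || return 1
      batch=()
    fi
  done < <(find "$src" -maxdepth 1 -type f -print0)

  # Move the final, partially filled group.
  if (( ${#batch[@]} > 0 )); then
    dir=$((dir + 1))
    mkdir -p "$dest/$dir" && mv -- "${batch[@]}" "$dest/$dir/" || return 1
  fi
}
```

A call such as `bucket_files ./extracted ./sorted 1000 &` (directory names are placeholders) would run it in the background, as the answer suggests. Keeping the destination on the same filesystem as the source makes each move a rename rather than a copy.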