I need to make a large directory containing many sub-directories portable
I have a dataset stored in a directory that has about 30,000 sub-directories. Each of these directories contains a text file and another sub-directory, which in turn holds some number of text files (anywhere from zero to hundreds). Many of my colleagues use this dataset, but as it stands it takes at least 6 hours to transfer it from one computer or hard disk in the lab to another. That is not because of the size of the dataset, but because of the cumbersome format in which it is stored. I would like to create an archive (such as a .tar.gz) of these data so that they can be transferred quickly between computers. Has anyone worked with something like this before, and can you tell me the fastest, best way to do it? I am thinking that a shell script might be quicker than creating the archive by hand.
Comments (1)
Suggestion: NFS-mount the directory. Then a Windows box or a Unix box can access the directory.
Comment: directory structures like that are bad news for inodes in a filesystem, and they increase search times as well.
Answer: Pack the whole tree into a single tar archive. This will work on any POSIX-compliant Unix box, and assumes there is just one base directory for your repository.
This creates a relative-path tar archive, meaning you can unpack it into any low-level directory instead of off the root.
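A minimal sketch of one way to build and unpack such a relative-path archive, assuming a standard `tar` and `gzip` are installed. The function names `pack_dataset` and `unpack_dataset` are hypothetical helpers made up for illustration, not commands from the answer above.

```shell
#!/bin/sh
# Sketch: pack a dataset directory into one gzip-compressed tar archive
# whose member paths are relative, then unpack it somewhere else.
# pack_dataset / unpack_dataset are hypothetical names for illustration.

# pack_dataset BASE_DIR ARCHIVE
# Archive everything under BASE_DIR with paths relative to BASE_DIR.
# ARCHIVE should live outside BASE_DIR so it does not end up inside itself.
pack_dataset() {
    base_dir=$1
    archive=$2
    # The subshell keeps the caller's working directory unchanged.
    # "tar cf - ." writes the archive to stdout; gzip compresses the stream.
    ( cd "$base_dir" && tar cf - . ) | gzip > "$archive"
}

# unpack_dataset ARCHIVE TARGET_DIR
# Extract ARCHIVE under TARGET_DIR, creating TARGET_DIR if needed.
unpack_dataset() {
    archive=$1
    target_dir=$2
    mkdir -p "$target_dir" || return 1
    # Members are stored as ./..., so they land under TARGET_DIR.
    gzip -dc "$archive" | ( cd "$target_dir" && tar xf - )
}
```

With placeholder paths, usage might look like `pack_dataset /path/to/dataset /tmp/dataset.tar.gz` on the source machine, followed by `unpack_dataset /tmp/dataset.tar.gz /path/to/copy` on the destination. Copying one large file instead of tens of thousands of small ones is what should avoid most of the per-file overhead described in the question.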