Unix: fast 'remove directory' for cleaning up daily builds

Posted on 2024-08-12 12:21:17

Is there a faster way to remove a directory than simply running

rm -r -f *directory*

? I am asking because our daily cross-platform builds are really huge (e.g. 4 GB per build), so the hard disks on some of the machines frequently run out of space.

This is particularly the case on our AIX and Solaris platforms.

Maybe there are 'special' commands for removing directories on these platforms?

PASTE-EDIT (moved my own separate answer into the question):

I am generally wondering why 'rm -r -f' is so slow. Doesn't 'rm' just need to modify the '..' or '.' entries to de-allocate filesystem entries?

Something like

mv *directory* /dev/null

would be nice.

Comments (14)

来日方长 2024-08-19 12:21:17

For deleting a directory from a filesystem, rm is your fastest option.
On Linux, we sometimes do our builds (a few GB) in a ramdisk, and it has a really impressive delete speed :) You could also try different filesystems, but on AIX/Solaris you may not have many options...

If your goal is to have the directory $dir empty now, you can rename it, and delete it later from a background/cron job:

mv "$dir" "$dir.old"
mkdir "$dir"
# later
rm -r -f "$dir.old"
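
The rename-then-delete steps above can be wrapped in a small function. This is only a minimal sketch, not something from the original answer: the name quick_clear and the ".old.<pid>" suffix are assumptions, and it relies on the renamed directory staying on the same filesystem so that mv is a cheap rename. The recreated directory gets default permissions, not those of the old one.

```shell
#!/bin/sh
# quick_clear DIR: hypothetical helper that empties DIR almost instantly
# by renaming it, recreating it, and deleting the old copy in the background.
quick_clear() {
    dir=${1%/}                      # drop a trailing slash, if any
    old="$dir.old.$$"               # assumed naming scheme: DIR.old.<pid>
    mv "$dir" "$old" || return 1    # same filesystem, so this is only a rename
    mkdir "$dir" || return 1        # fresh, empty directory under the old name
    rm -rf "$old" &                 # the slow part runs in the background
}
```

Usage would look like quick_clear /path/to/build; the caller can start the next build right away while rm is still working.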

Another trick is to create a separate filesystem for $dir; when you want to delete it, you simply re-create the filesystem. Something like this:

# initialization
mkfs.something /dev/device
mount /dev/device "$dir"


# when you want to delete it:
umount "$dir"
# re-init
mkfs.something /dev/device
mount /dev/device "$dir"
尘曦 2024-08-19 12:21:17

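I forgot the source of this trick, but it works:

EMPTYDIR=$(mktemp -d)
# sync the empty directory over the target; --delete removes everything that is not in the source
rsync -r --delete "$EMPTYDIR"/ dir_to_be_emptied/
rmdir "$EMPTYDIR"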
心如荒岛 2024-08-19 12:21:17

rm -r directory works by recursing depth-first down through the directory, deleting files, and deleting the directories on the way back up. It has to, since you cannot delete a directory that is not empty.

Long, boring details: each file system object is represented by an inode, and the file system keeps a flat, file-system-wide array of inodes.[1] If you just deleted the directory without first deleting its children, the children would remain allocated, but without any pointers to them. (fsck checks for that kind of thing when it runs, since it represents file system damage.)

[1] That may not be strictly true for every file system out there, and there may be a file system that works the way you describe. It would possibly require something like a garbage collector. However, all the common ones I know of act as if fs objects are owned by inodes, and directories are lists of name/inode number pairs.
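
The depth-first order described above can be reproduced with find. The following is an illustrative sketch only (remove_tree is a made-up name, and this is not how rm is actually implemented): it shows that rmdir refuses a non-empty directory, and that visiting children before their parents makes one unlink per file and one rmdir per directory sufficient.

```shell
#!/bin/sh
# remove_tree DIR: hypothetical illustration of the work rm -r has to do.
# find -depth visits a directory's contents before the directory itself,
# so every directory is already empty when rmdir reaches it.
remove_tree() {
    find "$1" -depth \( -type d -exec rmdir {} \; \) -o -exec rm -f {} \;
}

demo=$(mktemp -d)
mkdir -p "$demo/tree/sub/deeper"
touch "$demo/tree/a" "$demo/tree/sub/b" "$demo/tree/sub/deeper/c"

# Removing the non-empty directory directly is refused.
if rmdir "$demo/tree" 2>/dev/null; then
    echo "unexpected: rmdir removed a non-empty directory"
else
    echo "rmdir refused: directory not empty"
fi

remove_tree "$demo/tree"    # children first, parents last
```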

缘字诀 2024-08-19 12:21:17

On AIX at least, you should be using LVM, the logical volume manager. All our systems bundle all the physical hard drives into a single volume group and then create one big honkin' file system out of that.

That way, you can add physical devices to your machine at will and increase the size of your file system to whatever you need.

One other solution I've seen is to allocate a trash directory on each file system and use a combination of mv and a find cron job to tackle the space problem.

Basically, have a cron job that runs every ten minutes and executes:

rm -rf /trash/*
rm -rf /filesys1/trash/*
rm -rf /filesys2/trash/*

Then, when you want your specific directory on that file system recycled, use something like:

mv /filesys1/overnight /filesys1/trash/overnight

and, within the next ten minutes your disk space will start being recovered. The filesys1/overnight directory will immediately be available for use even before the trashed version has started being deleted.

It's important that the trash directory be on the same filesystem as the directory you want to get rid of, otherwise you have a massive copy/delete operation on your hands rather than a relatively quick move.
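
The same-filesystem requirement can be checked before moving anything. This is a rough sketch under stated assumptions: trash_dir and fs_of are hypothetical helpers, the check simply compares the first column of df -P, and the ".<pid>" suffix is added here so that a second "overnight" does not collide with one the cron job has not removed yet.

```shell
#!/bin/sh
# fs_of PATH: print the filesystem (first column of df -P) that holds PATH.
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# trash_dir SRC TRASH: hypothetical helper; moves SRC into TRASH only when
# both live on the same filesystem, so that mv is a rename and not a copy.
trash_dir() {
    src=${1%/}
    trash=${2%/}
    if [ "$(fs_of "$src")" != "$(fs_of "$trash")" ]; then
        echo "trash_dir: $src and $trash are on different filesystems" >&2
        return 1
    fi
    # The .$$ suffix (an assumption) keeps names in the trash unique.
    mv "$src" "$trash/$(basename "$src").$$"
}
```

With this in place, trash_dir /filesys1/overnight /filesys1/trash would replace the plain mv, and the cron job stays unchanged.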

白云悠悠 2024-08-19 12:21:17

If rm -rf is slow, perhaps you are using a "sync" option or similar, which is writing to the disk too often. On Linux ext3 with normal options, rm -rf is very quick.

One option for fast removal which would work on Linux and presumably also on various Unixen is to use a loop device, something like:

hole temp.img $((5*1024*1024*1024))  # create a 5 GB "hole" file
mkfs.ext3 -F temp.img                # -F: temp.img is a regular file, not a block device
mkdir -p mnt-temp
sudo mount -o loop temp.img mnt-temp

The "hole" program is one I wrote myself to create a large empty file using a "hole" rather than allocated blocks on the disk, which is much faster and doesn't use any disk space until you really need it. http://sam.nipl.net/coding/c-examples/hole.c

I just noticed that GNU coreutils contains a similar program "truncate", so if you have that you can use this to create the image:

truncate --size=$((5*1024*1024*1024)) temp.img

Now you can use the mounted image under mnt-temp for temporary storage, for your build. When you are done with it, do this to remove it:

sudo umount mnt-temp
rm temp.img
rmdir mnt-temp

I think you will find that removing a single large file is much quicker than removing lots of little files!

If you don't care to compile my "hole.c" program, you can use dd, but this is much slower:

dd if=/dev/zero of=temp.img bs=1024 count=$((5*1024*1024))  # create a 5 GB allocated file
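
dd can also create the image as a sparse file, which avoids both the slow write and the need for "hole" or truncate. The sketch below is not from the original answer; make_sparse is a hypothetical name, and it assumes a dd that extends the output file to the seek offset when count=0 (GNU and BSD dd do; I have not checked AIX or Solaris).

```shell
#!/bin/sh
# make_sparse FILE KIB: hypothetical helper; creates FILE with an apparent
# size of KIB kibibytes without writing any data blocks.
# count=0 copies nothing; seek=KIB makes dd extend the file to that offset.
make_sparse() {
    dd if=/dev/zero of="$1" bs=1024 count=0 seek="$2" 2>/dev/null
}

img=$(mktemp)
make_sparse "$img" 1024     # small 1 MiB demo; 5 GiB would be $((5*1024*1024))
ls -ls "$img"               # first column: blocks actually allocated
```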
捂风挽笑 2024-08-19 12:21:17

I think there is actually nothing other than "rm -rf", as you quoted, to delete your directories.

To avoid doing it manually over and over, you can run a daily cron script that recursively deletes all the build directories under your build root directory if they are "old enough", with something like:

find <buildRootDir>/* -prune -mtime +4 -exec rm -rf {} \;

(here -mtime +4 means "any file older than 4 days")

Another way would be to configure your builder (if it allows such things) to overwrite the previous build with the current one.
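
Before putting such a find command into cron, a dry run is a cheap safety net. The wrapper below is a sketch, not part of the original answer: purge_old_builds and its "run" argument are hypothetical, and the added -type d is an assumption that only build directories should be removed. Note that -mtime looks at the top-level directory's own modification time, which changes only when entries are added to or removed from that directory itself.

```shell
#!/bin/sh
# purge_old_builds ROOT DAYS [run]: hypothetical wrapper around the find
# command above. Without the third argument it only lists what would go.
purge_old_builds() {
    root=${1%/}
    days=$2
    if [ "${3:-}" = run ]; then
        find "$root"/* -prune -type d -mtime +"$days" -exec rm -rf {} \;
    else
        find "$root"/* -prune -type d -mtime +"$days" -print
    fi
}
```

Calling purge_old_builds /builds 4 prints the candidates; adding run as a third argument deletes them.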

柏拉图鍀咏恒 2024-08-19 12:21:17

I was looking into this as well.

I had a dir with 600,000+ files.

rm * would fail, because there are too many entries.

find . -exec rm {} \; was nice, deleting ~750 files every 5 seconds. I was checking the rm rate from another shell.

So instead I wrote a short script to rm many files at once, which removed about 1000 files every 5 seconds. The idea is to put as many files into 1 rm command as you can to increase the efficiency.

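#!/usr/bin/ksh
# Remove the files listed in ./filelist, 40 per rm call.
# Assumes the file names contain no whitespace.
string=""
count=0
for i in $(cat filelist); do
    string="$string $i"
    count=$((count + 1))
    if [[ $count -eq 40 ]]; then
        rm $string
        string=""
        count=0    # was 1, which made every later batch 39 files
    fi
done
# Remove the final, partial batch.
[[ -n $string ]] && rm $string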
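
The same batching idea is available directly in find through the "+" terminator of -exec, which is specified by POSIX. The sketch below is an alternative, not the author's method; remove_files is a hypothetical name, and the demo uses file names containing a space to show that no quoting problems arise.

```shell
#!/bin/sh
# remove_files DIR: hypothetical helper. The "+" terminator makes find pack
# as many path names as fit into each rm invocation, much like xargs, and
# passes them as separate arguments so spaces or quotes in names are safe.
remove_files() {
    find "$1" -type f -exec rm -f {} +
}

demo=$(mktemp -d)
i=0
while [ "$i" -lt 500 ]; do
    : > "$demo/file $i.o"       # names with a space, on purpose
    i=$((i + 1))
done
remove_files "$demo"
```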
酒废 2024-08-19 12:21:17

On Solaris, this is the fastest way I have found:

find /dir/to/clean -type f | xargs rm

If you have files with odd paths, use

find /dir/to/clean -type f | while read line; do echo "$line"; done | xargs rm
π浅易 2024-08-19 12:21:17

Use
perl -e 'for(<*>){((stat)[9]<(unlink))}'
Please refer to the link below:
http://www.slashroot.in/which-is-the-fastest-method-to-delete-files-in-linux

蹲墙角沉默 2024-08-19 12:21:17

I needed to delete 700 GB from dozens of directories on a 1 TB AWS EBS disk (ext3) before copying the remainder to a new 200 GB XFS volume. Deleting was taking hours and left that volume at 100% iowait. Since disk I/O and server time are not free, I mounted an empty volume over each directory instead, which hides its contents and took only a fraction of a second per directory.

Here /dev/sdb is an empty volume of any size:

directory_to_delete=/ebs/var/tmp/

mount /dev/sdb "$directory_to_delete"

nohup rsync -avh /ebs/ /ebs2/

吻泪 2024-08-19 12:21:17

I coded a small Java application, RdPro (Recursive Directory Purge tool), which is faster than rm. It can also remove target directories that the user specifies under a root. It works on both Linux/Unix and Windows, and it has both a command-line version and a GUI version.

https://github.com/mhisoft/rdpro

沉溺在你眼里的海 2024-08-19 12:21:17

I had to delete more than 300,000 files on Windows. I had Cygwin installed. Luckily I had all the primary directories in a database, so I created a for loop that read each line entry and deleted it using rm -rf.

静若繁花 2024-08-19 12:21:17

I just use find ./ -delete in the folder to empty, and it deleted 620,000 directories (total size 100 GB) in around 10 minutes.

Source: a comment on this site, https://www.slashroot.in/comment/1286#comment-1286

凶凌 2024-08-19 12:21:17

If you would like to remove hundreds of thousands of tiny daily-build files (which consume useful memory) from your Linux server, the Perl command below helped me a lot. It is the most efficient way I know and use. You might try it for your solution.

cd /home/admin/tmp    # or: cd yourdirectory
perl -e 'for(<*>){((stat)[9]<(unlink))}'

Note: I mostly use it in the /home/admin/tmp directory. Also, I do not recommend using that command in crontab. You can use find instead, as it is more customizable; for example, you can check for and delete session files older than 30 minutes.
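
The find variant mentioned in the note could look roughly like this. It is a sketch under assumptions: purge_stale is a hypothetical name, and -mmin is an extension found in GNU and BSD find, not part of POSIX, so it may be missing on other systems.

```shell
#!/bin/sh
# purge_stale DIR MINUTES: hypothetical helper; deletes regular files under
# DIR whose modification time is more than MINUTES minutes ago.
# -mmin is an extension found in GNU and BSD find, not in POSIX.
purge_stale() {
    find "$1" -type f -mmin +"$2" -exec rm -f {} +
}
```

For the session-file case from the note, a cron job would call something like purge_stale /home/admin/tmp 30.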
