How to find files with the same size?
I have a file structure like so:
a/file1
a/file2
a/file3
a/...
b/file1
b/file2
b/file3
b/...
...
where, within each dir, some files have the same file size, and I would like to delete those.
I guess if the problem could be solved for one dir, e.g. dir a, then I could wrap a for-loop around it?
for f in *; do
???
done
But how do I find files with same size?
This only checks files, not directories.
$5 is the size field in the output of ls -l.
test:
Update based on Michał Šrajer's comment:
File names with spaces are now supported as well.
command:
test:
Solution that works with file names containing spaces (based on Kent's (+1) and awiebe's (+1) posts):
To make it actually remove the duplicates, remove echo from the xargs call.
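The dry-run idea (print the rm command first, delete only once the output looks right) can be sketched roughly as follows. This is not the poster's original command, just a minimal illustration: list_same_size_dups is a hypothetical helper name, the first file of each size is assumed to be the one to keep, and the -r flag in the usage line assumes GNU xargs.

```shell
# Hypothetical helper: print (NUL-separated) every regular file in directory $1
# whose size was already seen on an earlier file in the same directory.
# Assumption: the first file of each size (in glob order) is the one to keep.
list_same_size_dups() {
    local dir="$1" f size seen=" "
    for f in "$dir"/*; do
        [ -f "$f" ] || continue                  # regular files only, no directories
        size=$(wc -c <"$f" | tr -d '[:space:]')  # size in bytes
        case $seen in
            *" $size "*) printf '%s\0' "$f" ;;   # size seen before: report this file
            *) seen="$seen$size " ;;             # first file of this size: remember it
        esac
    done
}

# Dry run for directory a; echo only prints the rm command.
# Remove "echo" to really delete the files.
# list_same_size_dups a | xargs -0 -r echo rm --
```

Because the names are NUL-separated and read with xargs -0, spaces in file names should not split a name into several arguments.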
Here is code if you need the size of a file:
Then use a for loop to walk through the items in your structure, starting with the first.
Store the size of the current file in a variable.
Nest a second for loop inside the first to compare every other item in your structure (excluding the current one) against the current item.
Route the names of all matching files into a text file so you can check that your script is correct (instead of executing rm immediately).
Execute rm on the contents of this file.
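The steps above might look roughly like this in shell. It is only a sketch under a few assumptions: write_same_size_list is a made-up name, file names are assumed not to contain newlines (the list file holds one name per line), and the first file of each size is kept rather than listed.

```shell
# Hypothetical helper: compare every file in directory $1 against every other
# file there and write the names of same-size duplicates to list file $2.
# Assumption: the list file lives outside of directory $1.
write_same_size_list() {
    local dir="$1" list="$2" f g fsize gsize
    : >"$list"                                     # start with an empty list
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        grep -Fxq -- "$f" "$list" && continue      # already listed as a duplicate
        fsize=$(wc -c <"$f" | tr -d '[:space:]')   # size of the current item
        for g in "$dir"/*; do
            [ -f "$g" ] || continue
            [ "$g" = "$f" ] && continue            # exclude the current item
            grep -Fxq -- "$g" "$list" && continue  # do not list a file twice
            gsize=$(wc -c <"$g" | tr -d '[:space:]')
            if [ "$gsize" -eq "$fsize" ]; then
                printf '%s\n' "$g" >>"$list"
            fi
        done
    done
}

# Review the list first, then delete what it names:
# write_same_size_list a /tmp/same_size.txt
# while IFS= read -r name; do rm -- "$name"; done </tmp/same_size.txt
```

Writing to a file first keeps the destructive step separate, so a mistake in the comparison logic shows up in the list rather than in missing files.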
Plain bash solution
Based on the accepted answer, the following provides a list of all the files of the same size in the current directory (so you can choose which one to keep), sorted by size:
To determine whether the files are actually the same, and do not just contain the same number of bytes, run shasum or md5sum on each file:
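One way such a listing could be produced is sketched below. It is an illustration rather than the poster's original commands: list_same_size is a hypothetical name, and file names are assumed to contain neither tabs nor newlines, because the output is one "size<TAB>name" line per file.

```shell
# Hypothetical helper: print "size<TAB>name" for every regular file in
# directory $1 (default: current directory) whose size occurs more than once,
# sorted by size.
list_same_size() {
    local dir="${1:-.}" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s\t%s\n' "$(wc -c <"$f" | tr -d '[:space:]')" "$f"
    done |
    sort -n |
    awk -F '\t' '
        { size[NR] = $1; line[NR] = $0; count[$1]++ }   # remember lines, count sizes
        END {
            for (i = 1; i <= NR; i++)
                if (count[size[i]] > 1) print line[i]   # keep sizes seen at least twice
        }
    '
}

# Checksum every candidate to see which ones really have the same content:
# list_same_size . | cut -f2- | while IFS= read -r f; do shasum "$f"; done
```

Files that share a checksum are the real duplicates; files that only share a size will show different checksums.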
Looks like what you really want is a duplicate file finder?
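It sounds like this has been answered several times and in several different ways, so I may be beating a dead horse, but here goes...
find DIR_TO_RUN_ON -size SIZE_OF_FILE_TO_MATCH -exec rm {} \;
Note that -size counts 512-byte blocks unless you add a unit suffix, so use something like -size 1234c to match an exact size in bytes.
find is an awesome command and I highly recommend reading its man page.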
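As a rough sketch of how the size could be filled in automatically, the helper below takes the byte count of a reference file and passes it to find. The name same_size_as is made up for illustration, and -maxdepth and -samefile are assumed to be available (they are in GNU find and BSD find).

```shell
# Hypothetical helper: list regular files directly inside directory $1 that
# have exactly the same size in bytes as reference file $2 (which is skipped).
same_size_as() {
    local dir="$1" ref="$2" bytes
    bytes=$(wc -c <"$ref" | tr -d '[:space:]')
    # The "c" suffix makes -size count bytes instead of 512-byte blocks.
    find "$dir" -maxdepth 1 -type f -size "${bytes}c" ! -samefile "$ref" -print
}

# Dry run: same_size_as a a/file1
# To delete instead of list, replace -print with: -exec rm -- {} +
```

Keeping -print until the output looks right avoids deleting files because of a wrong size or directory argument.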