Totaling the disk space used by certain file types

Posted on 2024-08-03 09:19:26

I have some files across several folders:

/home/d/folder1/a.txt
/home/d/folder1/b.txt
/home/d/folder1/c.mov
/home/d/folder2/a.txt
/home/d/folder2/d.mov
/home/d/folder2/folder3/f.txt

How can I measure the grand total amount of disk space taken up by all the .txt files in /home/d/?

I know du will give me the total space of a given folder, and ls -l will give me the total space of individual files, but what if I want to add up all the txt files and just look at the space taken by all .txt files in one giant total for all .txt in /home/d/ including both folder1 and folder2 and their subfolders like folder3?

雨轻弹 2024-08-10 09:19:26

find folder1 folder2 -iname '*.txt' -print0 | du --files0-from - -c -s | tail -1
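To see this against the exact layout from the question, here is a self-contained sketch (file contents are arbitrary placeholders; `--files0-from` is GNU-specific):

```shell
# build the sample tree from the question (placeholder contents)
cd "$(mktemp -d)"
mkdir -p d/folder1 d/folder2/folder3
printf 'aaaa' > d/folder1/a.txt
printf 'bb'   > d/folder2/a.txt
printf 'c'    > d/folder2/folder3/f.txt
printf 'mov'  > d/folder1/c.mov    # should not be counted

cd d
# one du line per .txt file, plus a final grand-total line
find folder1 folder2 -iname '*.txt' -print0 | du --files0-from=- -c -s | tail -1
```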

时光与爱终年不遇 2024-08-10 09:19:26

This will report disk space usage in bytes by extension:

find . -type f -printf "%f %s\n" |
  awk '{
      PARTSCOUNT=split( $1, FILEPARTS, "." );
      EXTENSION=PARTSCOUNT == 1 ? "NULL" : FILEPARTS[PARTSCOUNT];
      FILETYPE_MAP[EXTENSION]+=$2
    }
   END {
     for( FILETYPE in FILETYPE_MAP ) {
       print FILETYPE_MAP[FILETYPE], FILETYPE;
      }
   }' | sort -n

Output:

3250 png
30334451 mov
57725092729 m4a
69460813270 3gp
79456825676 mp3
131208301755 mp4
甜是你 2024-08-10 09:19:26

Simple:

du -ch *.txt

If you just want the total space taken to show up, then:

du -ch *.txt | tail -1
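Worth noting: a bare `*.txt` glob only covers the current directory. With bash ≥ 4 the same `du -ch` can be combined with recursive globbing; a self-contained demo (directory names are just illustrative):

```shell
cd "$(mktemp -d)"
mkdir -p folder1 folder2
printf 'aaaa' > folder1/a.txt
printf 'bb'   > folder2/b.txt

shopt -s globstar        # bash >= 4: make ** match across directory levels
du -ch **/*.txt | tail -1
```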
陌伤ぢ 2024-08-10 09:19:26

Here's a way to do it (in Linux, using GNU coreutils du and Bash syntax), avoiding bad practice:

total=0
while read -r line
do
    size=($line)
    (( total+=size ))
done < <( find . -iname "*.txt" -exec du -b {} + )
echo "$total"

If you want to exclude the current directory, use -mindepth 2 with find.

Another version that doesn't require Bash syntax:

find . -iname "*.txt" -exec du -b {} + | awk '{total += $1} END {print total}'

Note that these won't work properly with file names which include newlines (but those with spaces will work).
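For instance, the `-mindepth 2` variation applied to the awk form looks like this (self-contained sketch; `du -b` is GNU-specific):

```shell
cd "$(mktemp -d)"
mkdir sub
printf 'top'   > top.txt       # depth 1: excluded by -mindepth 2
printf 'below' > sub/deep.txt  # depth 2: counted (5 bytes)

find . -mindepth 2 -iname "*.txt" -exec du -b {} + |
  awk '{total += $1} END {print total}'
# prints 5
```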

深居我梦 2024-08-10 09:19:26

macOS

  • use the tool du and the parameter -I to exclude all other files

Linux

-X, --exclude-from=FILE
              exclude files that match any pattern in FILE

--exclude=PATTERN
              exclude files that match PATTERN
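A small sketch of the exclusion idea. Since `-I`/`--exclude` specify what to *skip*, this is only practical when you can enumerate the non-.txt types; the two-type tree here is hypothetical:

```shell
cd "$(mktemp -d)"
printf 'text'  > a.txt
printf 'video' > b.mov

# GNU du on Linux: count everything except the excluded pattern
du -ach --exclude='*.mov' .

# BSD du on macOS, same idea (not run here):
#   du -ach -I '*.mov' .
```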
暖风昔人 2024-08-10 09:19:26

This will do it:

total=0
for file in *.txt
do
    space=$(ls -l "$file" | awk '{print $5}')
    let total+=space
done
echo $total
做个ˇ局外人 2024-08-10 09:19:26

GNU find,

find /home/d -type f -name "*.txt" -printf "%s\n" | awk '{s+=$0}END{print "total: "s" bytes"}'
久光 2024-08-10 09:19:26

Building on ennuikiller's, this will handle spaces in names. I needed to do this and get a little report:

find -type f -name "*.wav" | grep export | ./calc_space

#!/bin/bash
# calc_space
echo SPACE USED IN MEGABYTES
echo
total=0
while read FILE
do
    du -m "$FILE"
    space=$(du -m "$FILE"| awk '{print $1}')
    let total+=space
done
echo $total
枕花眠 2024-08-10 09:19:26

A one liner for those with GNU tools on bash:

for i in $(find . -type f | perl -ne 'print $1 if m/\.([^.\/]+)$/' | sort -u); do echo "$i"": ""$(du -hac **/*."$i" | tail -n1 | awk '{print $1;}')"; done | sort -h -k 2 -r

You must have extglob enabled:

shopt -s extglob

If you want dot files to work, you must run

shopt -s dotglob

Sample output:

d: 3.0G
swp: 1.3G
mp4: 626M
txt: 263M
pdf: 238M
ogv: 115M
i: 76M
pkl: 65M
pptx: 56M
mat: 50M
png: 29M
eps: 25M

etc

乖乖公主 2024-08-10 09:19:26

my solution to get a total size of all text files in a given path and subdirectories (using perl oneliner)

find /path -iname '*.txt' | perl -lane '$sum += -s $_; END {print $sum}'
原来分手还会想你 2024-08-10 09:19:26

I like to use find in combination with xargs:

find . -name "*.txt" -print0 |xargs -0 du -ch

Add tail if you only want to see the grand total

find . -name "*.txt" -print0 |xargs -0 du -ch | tail -n1
剧终人散尽 2024-08-10 09:19:26

For anyone wanting to do this with macOS at the command line, you need a variation based on the -print0 argument instead of printf. Some of the above answers address that but this will do it comprehensively by extension:

find . -type f -print0 | xargs -0 stat -f "%N %z" |
  awk '{
      PARTSCOUNT=split( $1, FILEPARTS, "." );
      EXTENSION=PARTSCOUNT == 1 ? "NULL" : FILEPARTS[PARTSCOUNT];
      FILETYPE_MAP[EXTENSION]+=$2
    }
   END {
     for( FILETYPE in FILETYPE_MAP ) {
       print FILETYPE_MAP[FILETYPE], FILETYPE;
      }
   }' | sort -n
可爱暴击 2024-08-10 09:19:26

There are several potential problems with the accepted answer:

  1. it does not descend into subdirectories (without relying on non-standard shell features like globstar)
  2. in general, as pointed out by Dennis Williamson below, you should avoid parsing the output of ls
    • namely, if the user or group (columns 3 and 4) have spaces in them, column 5 will not be the file size
  3. if you have a million such files, this will spawn two million subshells, and it'll be sloooow

As proposed by ghostdog74, you can use the GNU-specific -printf option to find to achieve a more robust solution, avoiding all the excessive pipes, subshells, Perl, and weird du options:

# the '%s' format string means "the file's size"
find . -name "*.txt" -printf "%s\n" \
  | awk '{sum += $1} END{print sum " bytes"}'

Yes, yes, solutions using paste or bc are also possible, but not any more straightforward.

On macOS, you would need to use Homebrew or MacPorts to install findutils, and call gfind instead. (I see the "linux" tag on this question, but it's also tagged "unix".)
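A portability shim along those lines (assuming GNU findutils is installed as `gfind` on macOS, e.g. via Homebrew; on GNU/Linux it falls back to plain `find`):

```shell
cd "$(mktemp -d)"
printf 'aaaaa' > a.txt   # 5 bytes
printf 'bbb'   > b.txt   # 3 bytes
printf 'zz'    > c.mov   # ignored

# use gfind where it exists (macOS + findutils), else the system find
FIND=$(command -v gfind || command -v find)
"$FIND" . -name "*.txt" -printf "%s\n" |
  awk '{sum += $1} END {print sum " bytes"}'
# prints: 8 bytes
```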

Without GNU find, you can still fall back to using du:

find . -name "*.txt" -exec du -k {} + \
  | awk '{kbytes+=$1} END{print kbytes " Kbytes"}'

…but you have to be mindful of the fact that du's default output is in 512-byte blocks for historical reasons (see the "RATIONALE" section of the man page), and some versions of du (notably, macOS's) will not even have an option to print sizes in bytes.

Many other fine solutions here (see Barn's answer in particular), but most suffer the drawback of being unnecessarily complex or depending too heavily on GNU-only features—and maybe in your environment, that's OK!
