Grep: total count of specific elements by date

Posted on 2025-01-11 23:37:20

Is there a way in Linux to filter multiple files containing a lot of data in one command, without writing a script?

For this example, I want to know how many males appear per date. An added complication is that one date (January 3rd) appears in two separate files:

file1

Jan  1 john male=yes
Jan  1 james male=yes
Jan  2 kate male=no 
Jan  3 jonathan male=yes

file2

Jan  3 alice male=no
Jan  4 john male=yes 
Jan  4 jonathan male=yes
Jan  4 alice male=no

I want the total number of males for each date across all files. If there are no males for a given date, that date should produce no output.

Jan  1 2 
Jan  3 1
Jan  4 2

The only way I can think of is to count the males for one specific date at a time, but that does not scale: in real-world cases there could be many more files, and entering every date by hand would be a waste of time. Any help would be appreciated, thank you!

localhost:~# cat file1 file2 | grep "male=yes" | grep "Jan  1" | wc -l
2

乖乖公主 2025-01-18 23:37:20

grep -h 'male=yes' file? | \
    cut -c-6 | \
    awk '{c[$0] += 1} END {for(i in c){printf "%6s %4d\n", i, c[i]}}'

grep prints the lines containing male=yes (-h suppresses the file name prefix when several files are searched), cut keeps only the first 6 characters (the date), and awk counts the occurrences of each date and prints every date with its count at the end.

Given your files, the output will be:

Jan  1    2
Jan  3    1
Jan  4    2
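As a variation on the pipeline above, the filtering, date extraction and counting can also be done in a single awk process. This is only a sketch under a few assumptions that are not in the original answer: the gender flag is always the last field, the date always occupies the first 6 characters, and the scratch directory and variable names (workdir, result, count) are made up for the illustration. Comparing the last field exactly also avoids matching a hypothetical female=yes, which a plain substring grep for male=yes would hit. Because awk's for-in loop visits keys in no guaranteed order, the result is piped through sort.

```shell
# Recreate the two sample files from the question in a scratch directory
# (workdir is a hypothetical name used only for this sketch).
workdir=$(mktemp -d)
cat > "$workdir/file1" <<'EOF'
Jan  1 john male=yes
Jan  1 james male=yes
Jan  2 kate male=no
Jan  3 jonathan male=yes
EOF
cat > "$workdir/file2" <<'EOF'
Jan  3 alice male=no
Jan  4 john male=yes
Jan  4 jonathan male=yes
Jan  4 alice male=no
EOF

# Count lines whose last field is exactly "male=yes", keyed by the first
# 6 characters (the date). awk's for-in order is unspecified, so sort after.
result=$(awk '$NF == "male=yes" { count[substr($0, 1, 6)]++ }
              END { for (d in count) printf "%s %d\n", d, count[d] }' \
             "$workdir"/file? | LC_ALL=C sort)
printf '%s\n' "$result"

# Remove the scratch directory again.
rm -rf "$workdir"
```

The plain byte-order sort is enough here because all dates fall in one month and the day is space-padded. For data spanning several months, GNU sort's -M option (month-name ordering) would likely be the better fit.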