How do I delete duplicate lines in multiple files?

Posted 2024-10-12 19:40:55 · 247 characters · 1 view · 0 comments

I have 7 files with the following names in different directories and subdirectories:

tag0.txt, tag1.txt, tag2.txt, tag3.txt, tag01.txt, tag02.txt and tag03.txt

Some of these files contain duplicated rows. How can I delete the duplicated rows? Note that the rows in each file are not sorted, and the length of each file ranges from 500 to 1000 rows.

Any help would be much appreciated.

Thank you

倾其所爱 2024-10-19 19:40:55

Assuming you want to remove dupes on a per-file basis, the following doesn't require sorted files and thus doesn't mess with the order of the lines:

awk '!a[$0]++' infile > outfile

Since your files seem to be in different directories, it's probably easiest to just run that command manually 7 times. If you really want to, though, you can loop over the files like this:

#!/bin/sh

for file in /path/to/file1 /path/to/file2 ... /path/to/file7; do
    awk '!a[$0]++' "$file" > "$file".new && \
    mv "$file".new "$file"
done
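
If typing out seven paths is a nuisance, the same awk one-liner can be driven by find. This is only a minimal sketch: the function name dedupe_tree and the demo directory are made up for illustration, it assumes all files sit below one common top-level directory, and it assumes no file name contains a newline.

```shell
#!/bin/sh
# dedupe_tree DIR: de-duplicate every tag*.txt below DIR in place,
# keeping the first occurrence of each line and the original line order.
dedupe_tree() {
    find "$1" -type f -name 'tag*.txt' | while IFS= read -r file; do
        awk '!a[$0]++' "$file" > "$file.new" && mv "$file.new" "$file"
    done
}

# Hypothetical demo tree standing in for the real directories.
root=$(mktemp -d)
mkdir -p "$root/a/b"
printf 'x\ny\nx\nz\ny\n' > "$root/tag0.txt"
printf 'q\nq\np\n' > "$root/a/b/tag01.txt"

dedupe_tree "$root"
cat "$root/tag0.txt"
```

On real data you would call dedupe_tree with the directory that contains all seven files, ideally after making a backup, since the files are overwritten.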

只为守护你 2024-10-19 19:40:55

Use the sort and uniq commands, which are standard Unix utilities. Note that this sorts the output, so the original line order is lost:

cat "your files" | sort | uniq
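
To make the ordering difference concrete, here is a small sketch comparing the sort-based approach with the awk one-liner from the other answer. The sample file and the variable names are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical sample file with unsorted, duplicated lines.
sample=$(mktemp)
printf 'pear\napple\npear\nfig\napple\n' > "$sample"

# sort | uniq removes duplicates but emits the lines in sorted order.
sorted=$(sort "$sample" | uniq)

# sort -u is a shorter way to get the same result for this input.
sorted_u=$(sort -u "$sample")

# awk keeps the first occurrence of each line in its original position.
ordered=$(awk '!a[$0]++' "$sample")

printf 'sort | uniq:\n%s\n\nawk:\n%s\n' "$sorted" "$ordered"
```

If the line order in the tag files matters, the awk variant is probably the safer choice; if it does not, sort -u is the shortest option.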

忘东忘西忘不掉你 2024-10-19 19:40:55

Caution: this changes the files directly (in-place edit).

perl -i -ne 'print if not $seen{$ARGV}{$_}++' file1 file2 file3 ...
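
The $ARGV key in the Perl command is what keeps the de-duplication per file: a line already seen in file1 is still kept the first time it shows up in file2. The sketch below illustrates the same idea with awk's FILENAME instead of Perl (a swapped-in tool, printing to standard output rather than editing in place); the file names f1 and f2 are hypothetical.

```shell
#!/bin/sh
# Two hypothetical files that share the line "shared".
dir=$(mktemp -d)
printf 'shared\nonly1\nshared\n' > "$dir/f1"
printf 'shared\nonly2\n' > "$dir/f2"

# Keyed by line only: "shared" in f2 is dropped because f1 already had it.
global=$(awk '!seen[$0]++' "$dir/f1" "$dir/f2")

# Keyed by file name and line: duplicates are removed within each file only.
per_file=$(awk '!seen[FILENAME, $0]++' "$dir/f1" "$dir/f2")

printf 'global:\n%s\n\nper file:\n%s\n' "$global" "$per_file"
```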

溇涏 2024-10-19 19:40:55

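Requires bash 4.0 or later (for globstar). Note that sort changes the line order:

shopt -s globstar
for file in **/tag*.txt
do
    # Quote "$file" so paths with spaces work; use a per-file temp name.
    sort "$file" | uniq > "$file.tmp" && mv "$file.tmp" "$file"
done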