awk - delete the oldest duplicate lines, keeping the latest copy + delete one line above each deleted line
I have input in the following format:
#1655636921
cd
#1655636926
history
#1655637510
history
#1655637934
ls
#1655637934
ls
#1655638524
cd
#1655638927
ls
#1655638928
history
I would like to find duplicates (only among lines that do not start with '#', or, equivalently, only among the even-numbered lines), delete every earlier duplicate so that only the latest copy remains, and for each deleted duplicate also delete the line directly above it. The output would look like this:
#1655638524
cd
#1655638927
ls
#1655638928
history
I am new to awk and could not find a solution that preserves the latest duplicate. The only related solution I have found is:
awk '!visited[$0]++'
That deletes only the later duplicates and preserves the oldest one.
Thank you very much in advance for any kind of help.
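For context, a minimal sketch of why that one-liner keeps the oldest copy: visited[$0]++ evaluates to zero (false) only the first time a line is seen, so its negation is true only for the first occurrence. The keep_first wrapper name and the sample input are illustrative assumptions, not part of the question.

```shell
# Sketch: print a line only the first time it is seen.
# visited[$0]++ returns the old count (0 on first sight), so !0 is true once.
keep_first() {
    awk '!visited[$0]++' "$@"
}

# Illustrative input: the second "history" and the second "cd" are dropped.
printf 'cd\nhistory\nhistory\nls\ncd\n' | keep_first
```

Run on the sample above, this prints cd, history, ls, which is the opposite of what is wanted here (the latest copy should survive).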
Comments (3)
If you don't have the tac command on your system, you can create a tac function that does the same thing the command does, i.e. reverses the order of input lines, using just the mandatory POSIX tools awk, sort, and cut. The same function can also be written with cat -n (if your cat has the non-POSIX -n argument) or with nl (POSIX but not mandatory).
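One way such a tac function could be built and applied to this problem is sketched below. It assumes the input strictly alternates "#timestamp" and command lines. The function body and the dedupe_keep_latest name are hypothetical and are not taken from the original answer's code: the idea is to reverse the lines, keep the first occurrence of each command together with its timestamp, then reverse back.

```shell
# Hypothetical stand-in for the tac command, using only awk, sort, and cut:
# number every line, sort by that number in descending order, drop the number.
tac() {
    awk '{ printf "%d\t%s\n", NR, $0 }' "$@" |
    sort -k1,1nr |
    cut -f2-
}

# Keep only the latest copy of each command plus its timestamp line.
# After reversing, every command comes right before its "#timestamp" line,
# so the odd-numbered lines are the commands.
dedupe_keep_latest() {
    tac "$@" |
    awk 'NR % 2 { keep = !seen[$0]++ }   # decide once per command line
         keep                            # print the command and its timestamp
    ' |
    tac
}
```

Calling dedupe_keep_latest on a file holding the sample input from the question (or piping that input into it) should leave the three pairs for cd, ls, and history shown there as the desired output.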
Somehow there was a strange duplicate lingering, and I had to trim it out by brute force.
Assumptions:
Duplicates are determined by the whole 'line', so this means ls and ls *.txt are to be treated as two distinct commands (i.e., both will show up in the final output).
Duplicates are detected in 'even lines', which implies we do not need to worry about embedded linefeeds (in either the #comment or the command), nor multi-line #comments.
One awk idea eliminates the need for any other programs. If OP has access to GNU awk (v 4.0+) (for PROCINFO["sorted_in"] support), this can be streamlined a bit. Both versions generate the same output.
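A single-awk approach along these lines could look like the following sketch. It relies on the assumptions stated above (odd lines are timestamps, even lines are commands, no embedded linefeeds). The dedupe_single_awk name and the array names are hypothetical, and this is not the original answer's code.

```shell
# Hypothetical wrapper around one POSIX awk call, no other programs needed.
dedupe_single_awk() {
    awk '
        NR % 2 { stamp = $0; next }      # odd lines: remember the timestamp
        {
            n++                          # even lines: one more command entry
            cmd[n] = $0
            ts[n]  = stamp
            last[$0] = n                 # index of the latest copy of this exact line
        }
        END {
            for (i = 1; i <= n; i++)
                if (last[cmd[i]] == i) { # print only the latest copy of each command
                    print ts[i]
                    print cmd[i]
                }
        }
    ' "$@"
}
```

Because the whole line is the array key, ls and ls *.txt stay separate entries. With GNU awk 4.0+, setting PROCINFO["sorted_in"] (for example to "@val_num_asc") would presumably let the END block iterate the last array directly in position order instead of keeping the cmd and ts index arrays, at the cost of portability.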