Match column elements of two files and replace with the matching line using AWK/Perl

Posted 01-15 12:19

I have two files, each with 3 columns. I want to match column 3 of file1 against column 3 of file2. If a value matches, replace the entire line of file1 with the corresponding line from file2; otherwise leave the line as it is and move to the next one.

Here is an example: in file2, the third-column values reg[2] and reg[9][9] also appear in column 3 of file1, so the corresponding lines of file1 are replaced with the lines from file2.

File1:

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]

File2:

Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

Desired output

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

Attempted code:

awk -F, 'NR==FNR{a[$3]=$0;next;}a[$3]{$0=a[$3]}1' file2 file1

I am still a novice with awk one-liners, and I am definitely doing something wrong in the code above. What I am trying to do is use the third column as the key and the entire line as the value. If the key exists in column 3 of file1, replace the current line of file1 with the stored line from file2; otherwise leave it and move to the next line.

别闹i, 2025-01-22 12:19:40

I would use GNU AWK in the following way. Let file1.txt contain

Nancy Owen reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Nancy Owen reg[9][9]
Nancy Owen reg[54]

and let file2.txt contain

Done Approval reg[9][9]
Nancy Owen reg[10_8]
Nancy Owen reg[4][10]
Done Approval reg[2]

then

awk 'FNR==NR{arr[$3]=$0;next}{print($3 in arr?arr[$3]:$0)}' file2.txt file1.txt

output

Done Approval reg[2]
Nancy Owen reg[4_8]
Nancy Owen reg[7]
Done Approval reg[9][9]
Nancy Owen reg[54]

Explanation: file2.txt is processed first (FNR==NR is true only while the first file is being read). Each of its lines is stored in the array arr under a key equal to its third column ($3), and next skips the rest of the program for that line. Then file1.txt is processed: if the third column is among the keys of arr ($3 in arr), the corresponding stored line is printed; otherwise the current line ($0) is printed. The choice is made with the ternary operator, condition ? value_if_true : value_if_false.

(tested in GNU Awk 5.0.1)
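
For comparison with the attempt in the question, here is a minimal sketch under the assumption that the files are whitespace-separated exactly as shown. The likely culprit is -F, which makes the comma the field separator: the sample lines contain no commas, so $3 is empty on every line and every line of file1 ends up replaced by the last line of file2. The file names attempt.txt and fixed.txt are only illustrative.

```shell
# Recreate the sample inputs from the question in the current directory.
printf '%s\n' 'Nancy Owen reg[2]' 'Nancy Owen reg[4_8]' 'Nancy Owen reg[7]' \
              'Nancy Owen reg[9][9]' 'Nancy Owen reg[54]' > file1
printf '%s\n' 'Done Approval reg[9][9]' 'Nancy Owen reg[10_8]' \
              'Nancy Owen reg[4][10]' 'Done Approval reg[2]' > file2

# Attempt from the question: with -F, the whole line is $1 and $3 is empty,
# so all lines share the single key "" and the last line of file2 wins.
awk -F, 'NR==FNR{a[$3]=$0;next} a[$3]{$0=a[$3]} 1' file2 file1 > attempt.txt

# Same pattern/action structure with default whitespace splitting.
# `$3 in a` tests key membership without creating empty array entries.
awk 'NR==FNR{a[$3]=$0;next} $3 in a {$0=a[$3]} 1' file2 file1 > fixed.txt

cat fixed.txt
```

Using `$3 in a` rather than `a[$3]` as the pattern is a small design choice: a bare `a[$3]` creates an empty entry for every key it looks up and would treat a stored line such as "0" or an empty line as false.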


南薇, 2025-01-22 12:19:40

Noticing the perl tag, here's a Perl solution:

perl -ane 'if ($eof) {
               if (exists $h{ $F[2] }) {
                   print $h{ $F[2] }
               } else { print }
           } else {
               $h{ $F[2] } = $_;
               $eof = 1 if eof;
           }' -- file2 file1

-n reads the input line by line, running the code for each line;
-a splits each line on whitespace into the @F array;
We set the variable $eof once the end of the first file (file2) is reached;
While reading the first file (file2), we store each line in a hash keyed by its third column;
While reading the second file (file1), we check whether the hash has an entry for the current third column: if so, we print the stored line, otherwise we print the current line.
