unix：使用第二列合并 2 个文件

发布于 2024-10-16 21:57:51 字数 855 浏览 0 评论 0原文

我想根据第二列的内容合并两个文件。

文件 1：

"4742"  "209220_at"     2.60700394801826
"104"   "209396_s_at"   2.60651442103297
"749"   "202409_at"     2.59424724783704
"4168"  "209875_s_at"   2.58773204877464
"3973"  "1431_at"       2.52832098784342
"1826"  "207201_s_at"   2.41685345240968

文件 2：

"653"   "1431_at"       2.14595534191867
"1109"  "207201_s_at"   2.13777517447307
"353"   "212531_at"     2.12706340284672
"381"   "206535_at"     2.11456707231618
"1846"  "204534_at"     2.10919474441178

最后：

"3973"  "1431_at"       2.52832098784342 "653"   "1431_at"       2.14595534191867
"1826"  "207201_s_at"   2.41685345240968 "1109"  "207201_s_at"   2.13777517447307

我尝试了 comm、diff 和一些晦涩的 awk 单行代码，但没有成功。非常感谢任何帮助。本

原文

I'd like to merge two files according to the content of their 2nd columns.

File 1:

"4742"  "209220_at"     2.60700394801826
"104"   "209396_s_at"   2.60651442103297
"749"   "202409_at"     2.59424724783704
"4168"  "209875_s_at"   2.58773204877464
"3973"  "1431_at"       2.52832098784342
"1826"  "207201_s_at"   2.41685345240968

File2:

"653"   "1431_at"       2.14595534191867
"1109"  "207201_s_at"   2.13777517447307
"353"   "212531_at"     2.12706340284672
"381"   "206535_at"     2.11456707231618
"1846"  "204534_at"     2.10919474441178

To have in the end:

"3973"  "1431_at"       2.52832098784342 "653"   "1431_at"       2.14595534191867
"1826"  "207201_s_at"   2.41685345240968 "1109"  "207201_s_at"   2.13777517447307

I have tried comm, diff, some obscure awk one-liner without any success.
Any help much appreciated.
Ben

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

红颜悴 2024-10-23 21:57:51

您可以结合使用 sort 和 join 命令来完成此操作。最简单的方法是，

join -j2 <(sort -k2 file1) <(sort -k2 file2)

但显示效果与您想要的略有不同。它只显示公共连接字段，然后显示每个文件中的其余字段

"1431_at" "3973" 2.52832098784342 "653" 2.14595534191867
"207201_s_at" "1826" 2.41685345240968 "1109" 2.13777517447307

如果您需要与显示的格式完全相同的格式，那么您需要告诉 join 以这种方式输出，

join -o 1.1,1.2,1.3,2.1,2.2,2.3 -j2 <(sort -k2 file1) <(sort -k2 file2)

其中 -o 接受FILENUM.FIELDNUM 说明符列表。

请注意，我使用的 <() 语法不是 POSIX sh，因此如果您需要 POSIX sh 语法，您应该排序到临时文件。

You can do that with a combination of the sort and join commands. The straightforward approach is

join -j2 <(sort -k2 file1) <(sort -k2 file2)

but that displays slightly differently than you're looking for. It just shows the common join field and then the remaining fields from each file

"1431_at" "3973" 2.52832098784342 "653" 2.14595534191867
"207201_s_at" "1826" 2.41685345240968 "1109" 2.13777517447307

If you need the format exactly as you showed, then you would need to tell join to output in that manner

join -o 1.1,1.2,1.3,2.1,2.2,2.3 -j2 <(sort -k2 file1) <(sort -k2 file2)

where -o accepts a list of FILENUM.FIELDNUM specifiers.

Note that the <() syntax I'm using isn't POSIX sh, so you should sort to a temporary file if you need POSIX sh syntax.

回复收藏 0 原文

于我来说 2024-10-23 21:57:51

awk '
  # store the first file, indexed by col2
  NR==FNR {f1[$2] = $0; next}
  # output only if file1 contains file2's col2
  ($2 in f1) {print f1[$2], $0}
' file1 file2

awk '
  # store the first file, indexed by col2
  NR==FNR {f1[$2] = $0; next}
  # output only if file1 contains file2's col2
  ($2 in f1) {print f1[$2], $0}
' file1 file2

回复收藏 0 原文

深白境迁sunset 2024-10-23 21:57:51

如果文件很小，请用脚本语言（Perl、Python 和 Ruby 都是不错的选择）编写一个程序，将第一个文件读入一个散列，其键是第二列，然后读取第二个文件并使用散列查找来查找可以加入什么（如果有的话）。

如果文件很大，则每个文件交换第一列和第二列，将它们传递给 unix 排序实用程序，然后用脚本语言合并（加上列重新排序）两个排序的文件。

回复收藏 0 原文

谁人与我共长歌 2024-10-23 21:57:51

awk 'FNR==NR{a[$2]=$0} NR>FNR && ($2 in a){ print $0,a[$2] } ' file2 file1

awk 'FNR==NR{a[$2]=$0} NR>FNR && ($2 in a){ print $0,a[$2] } ' file2 file1

回复收藏 0 原文

~没有更多了~

关于作者

邮友

暂无简介

0 文章

0 评论

22 人气

关注发私信

友情链接

文江博客

unix：使用第二列合并 2 个文件

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（4）

关于作者

相关话题

热门标签

推荐作者

留蓝

18790681156

zach7772

Wini

ayeshaaroy

初雪

友情链接

unix：使用第二列合并 2 个文件

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（4）

关于作者

相关话题

热门标签

推荐作者

留蓝

18790681156

zach7772

Wini

ayeshaaroy

初雪

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。