Extracting data from a dictionary
I have two tab-delimited files: file 1 contains identifiers, and file 2 has the values related to these identifiers (think of it as a very big dictionary).
file 1
Ronny
Rubby
Suzie
Paul
file 1 has only one column.
file 2
Alistar	Barm	Cathy	Paul	Ronny	Rubby	Suzie	Tom	Uma	Vai	Zai
12	13	14	12	11	11	12	23	30	0.34	0.65
1	4	56	23	12	8.9	5.1	1	4	25	3
File 2 has n rows.
What I want: if an identifier from file 1 is present in file 2, all the values related to it should be written to another tab-delimited file.
Something like this:
Paul	Ronny	Rubby	Suzie
12	11	11	12
23	12	8.9	5.1
Thank you in advance.
NOTE
Your example output is NOT correct: it has "Ruby", but your file 1 example has "Rubby". Ruby =/= Rubby
output
You can do this using only bash:
Broken into multiple lines, with whitespace added:
An example in Python that does the work as a stream (i.e., it doesn't need to load the full file before starting the output):
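A minimal sketch of such a streaming approach, not necessarily the code this answer originally showed. It assumes the first row of file 2 holds the identifiers and that both files are tab-delimited. The function name extract_columns and the SAMPLE_* variables are illustrative; the sample data is copied from the question.

```python
import csv
import io
import sys


def extract_columns(ids, data, out):
    """Copy to `out` only the columns of `data` whose header is listed in `ids`.

    `ids` and `data` are open text files (or any iterables of lines);
    `out` is a writable text file. Rows are written as they are read.
    """
    # File 1: one identifier per line; blank lines are ignored.
    wanted = {line.strip() for line in ids if line.strip()}

    reader = csv.reader(data, delimiter="\t")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")

    header = next(reader, None)
    if header is None:
        return  # empty data file, nothing to write

    # Positions of the wanted columns, kept in file 2's own order.
    keep = [i for i, name in enumerate(header) if name in wanted]
    writer.writerow([header[i] for i in keep])

    # Each data row is filtered and written immediately (assumes every
    # row has as many fields as the header).
    for row in reader:
        writer.writerow([row[i] for i in keep])


# Sample data from the question, held in memory for the demonstration.
SAMPLE_IDS = "Ronny\nRubby\nSuzie\nPaul\n"
SAMPLE_DATA = (
    "Alistar\tBarm\tCathy\tPaul\tRonny\tRubby\tSuzie\tTom\tUma\tVai\tZai\n"
    "12\t13\t14\t12\t11\t11\t12\t23\t30\t0.34\t0.65\n"
    "1\t4\t56\t23\t12\t8.9\t5.1\t1\t4\t25\t3\n"
)

extract_columns(io.StringIO(SAMPLE_IDS), io.StringIO(SAMPLE_DATA), sys.stdout)
```

With real files, the same function can be given handles from open(path, newline=""). Only the identifier set and one row at a time are held in memory.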
Output:
Perl solution:
Something like this could probably work, depending on what you want.
This might work for you:
Explanation:
- Produce a list of the column numbers, separated by ,'s.
- Build a cut command from the comma separated column number list.
- Run the cut command against the data file.
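The steps above can be sketched in Python as follows. This is an illustration of the idea, not the command this answer originally gave. The helper name build_cut_command and the file name file2.txt are hypothetical, and the header line is taken from the question's sample. The sketch only builds the cut argument list and does not run it.

```python
def build_cut_command(header_line, wanted_ids, data_path):
    """Return the argument list of a `cut` call that selects the wanted columns."""
    names = header_line.rstrip("\n").split("\t")
    wanted = set(wanted_ids)

    # cut numbers fields starting at 1, so enumerate from 1.
    columns = [str(i) for i, name in enumerate(names, start=1) if name in wanted]
    if not columns:
        raise ValueError("none of the identifiers appear in the header")

    # Tab is cut's default field delimiter, so no -d option is needed.
    return ["cut", "-f", ",".join(columns), data_path]


# Header row of file 2 from the question; "file2.txt" is a placeholder name.
header = "Alistar\tBarm\tCathy\tPaul\tRonny\tRubby\tSuzie\tTom\tUma\tVai\tZai\n"
cmd = build_cut_command(header, ["Ronny", "Rubby", "Suzie", "Paul"], "file2.txt")
print(" ".join(cmd))  # cut -f 4,5,6,7 file2.txt
```

The returned list could be handed to subprocess.run(cmd) on a system where cut is installed. Returning a list rather than a shell string avoids quoting problems with unusual file names.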