BASH - 加入非第一列
我正在尝试将 2 个文件连接在一起 - 两个文件都是 CSV 格式 - 两个文件具有相同的列。以下是每个文件的示例:
文件 1:
CustName,AccountReference,InvoiceDate,InvoiceRefID,TelNo,Rental,GPRS,Mnet,MnetPlus,SMS,CSD,IntRoaming,NetAmount
acme,107309 ,2011-09-24 12:47:11.000,AP/157371,07741992165 ,2.3900,.0000,.0000,.0000,.0000,.0000,.0000,2.3900
acme,107309 ,2011-09-24 12:58:32.000,AP/162874,07740992165 ,2.0000,.0000,.0000,.0000,.0000,.0000,.0000,2.0000
anot,107308 ,2011-09-24 12:58:32.000,AP/162874,07824912428 ,2.0000,.0000,.0000,.0000,.0000,.0000,.0000,2.0000
anot,107308 ,2011-09-24 12:47:11.000,AP/157371,07834919928 ,1.5500,.0000,.0000,.0000,.0000,.0000,.0000,1.5500
文件 2:
CustName,AccountReference,InvoiceDate,InvoiceRefID,TelNo,Rental,GPRS,Mnet,MnetPlus,SMS,CSD,IntRoaming,NetAmount
acme,100046,2011-10-05 08:29:19,AB/020152,07824352342,12.77,0.00,0.00,0.00,0.00,0.00,0.00,12.77
anbe,100046,2011-10-05 08:29:19,AB/020152,07741992165,2.50,0.00,0.00,0.00,0.00,0.00,0.00,2.50
acve,100046,2011-10-05 08:29:19,AB/020152,07740992165,10.00,0.00,0.00,0.00,0.00,0.00,0.00,10.00
asce,100046,2011-10-05 08:29:19,AB/020152,07771335702,2.50,0.00,0.00,0.00,0.00,0.00,0.00,2.50
我想将这两个文件连接在一起 - 但只取其中一些列,其他列可以忽略(有些相同,有些不同) -
AccountRef,telno,rental_file1,rental_file2,gprs_file1,gprs_file2
等等....
连接应该在 telno
列上完成(看起来我有空格)文件 1 - 希望可以忽略吗?
我找到了很多使用 JOIN
的示例,但它们都使用第一列作为连接上的键......任何指针都会很棒 - 谢谢
I am trying to join 2 files together - both files are in CSV format - both files have the same columns. Here is an example of each file :
File 1:
CustName,AccountReference,InvoiceDate,InvoiceRefID,TelNo,Rental,GPRS,Mnet,MnetPlus,SMS,CSD,IntRoaming,NetAmount
acme,107309 ,2011-09-24 12:47:11.000,AP/157371,07741992165 ,2.3900,.0000,.0000,.0000,.0000,.0000,.0000,2.3900
acme,107309 ,2011-09-24 12:58:32.000,AP/162874,07740992165 ,2.0000,.0000,.0000,.0000,.0000,.0000,.0000,2.0000
anot,107308 ,2011-09-24 12:58:32.000,AP/162874,07824912428 ,2.0000,.0000,.0000,.0000,.0000,.0000,.0000,2.0000
anot,107308 ,2011-09-24 12:47:11.000,AP/157371,07834919928 ,1.5500,.0000,.0000,.0000,.0000,.0000,.0000,1.5500
File 2:
CustName,AccountReference,InvoiceDate,InvoiceRefID,TelNo,Rental,GPRS,Mnet,MnetPlus,SMS,CSD,IntRoaming,NetAmount
acme,100046,2011-10-05 08:29:19,AB/020152,07824352342,12.77,0.00,0.00,0.00,0.00,0.00,0.00,12.77
anbe,100046,2011-10-05 08:29:19,AB/020152,07741992165,2.50,0.00,0.00,0.00,0.00,0.00,0.00,2.50
acve,100046,2011-10-05 08:29:19,AB/020152,07740992165,10.00,0.00,0.00,0.00,0.00,0.00,0.00,10.00
asce,100046,2011-10-05 08:29:19,AB/020152,07771335702,2.50,0.00,0.00,0.00,0.00,0.00,0.00,2.50
I would like to join the 2 files together - but just taking some of the columns the other columns can be ignored (some are the same, some are different) -
AccountRef,telno, rental_file1,rental_file2,gprs_file1,gprs_file2
etc etc ....
The join should be done on the telno
column (it seems I have white space in file 1 - hope that can be ignored ?
i have found lots of examples using JOIN
but all of them use the first column for the key on the join .... any pointers would be great - thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
基本答案是:
这会将
file
第 3 列上的文件file1
和file2
与file2
第 4 列连接起来>,然后在第 6 列和第 2 列上。当然,数据文件必须在相同的列上排序。-t ,
设置 CSV 的分隔符 - 但join
不会处理带引号的字符串中嵌入的逗号。如果您的数据很简单(没有带引号的字符串),那么您也可以使用
awk
。如果您的数据包含可能包含逗号等的引号字符串,那么您需要一个支持 CSV 的工具。我可能会将 Perl 与 Text::CSV 模块(以及 < href="http://search.cpan.org/perldoc?Text%3a%3aCSV_XS" rel="nofollow">Text::CSV_XS 模块以提高性能)。The basic answer is:
This will join the files
file1
andfile2
on column 3 fromfile
with column 4 fromfile2
, then on columns 6 and 2. The data files must be sorted on those same columns, of course. The-t ,
sets the separator for CSV - butjoin
will not handle embedded commas inside quoted strings.If your data is simple (no quoted strings) then you can also use
awk
. If your data has quoted strings which may contain commas, etc, then you need a CSV-aware tool. I'd probably use Perl with the Text::CSV module (and the Text::CSV_XS module for performance).看看
cat
和cut
:-)例如
产量
Have a look at
cat
andcut
:-)For instance
yields
所有 GNU 实用程序均记录于此:
http://www.gnu。 org/s/coreutils/manual/html_node/index.html#Top
对于您的问题,请参阅 cat、cut、sort、uniq 和 join。
All the GNU utilities documented here:
http://www.gnu.org/s/coreutils/manual/html_node/index.html#Top
For your problem, see cat, cut, sort, uniq and join.