Comparing CSV files

Posted on 2024-08-18 22:28:40

I want to write a shell script that compares two .csv files. The first file contains filename,path and the second contains filename,path,target. For every file listed in the first .csv that also appears in the second .csv, I want to output its target name.

Example:

a.csv

build.xml,/home/build/NUOP/project1  
eesX.java,/home/build/adm/acl

b.csv

build.xml,/home/build/NUOP/project1,M1
eesX.java,/home/build/adm/acl,M2
ddexse3.htm,/home/class/adm/33eFg

I want the output to be something like this.

M1 and M2

Please help
Thanks,

Comments (3)

三五鸿雁 2024-08-25 22:28:40

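If you don't necessarily need a shell script, this is easy to do in Python:

import csv

# Collect every (filename, path) pair listed in a.csv.
seen = set()
with open('a.csv', newline='') as f:
    for row in csv.reader(f):
        seen.add(tuple(row))

# Print the target of each b.csv row whose (filename, path) pair is in a.csv.
with open('b.csv', newline='') as f:
    for row in csv.reader(f):
        # Rows without a third field have no target to print.
        if len(row) > 2 and tuple(row[:2]) in seen:
            print(row[2])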
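
The snippet above compares fields exactly, so stray whitespace around a filename or path would prevent a match. Below is a minimal sketch of one way to handle that, assuming Python 3 and one record per line. The function name `matching_targets` and the inline sample lines are illustrative only and not part of the original answer.

```python
import csv


def matching_targets(a_lines, b_lines):
    """Return targets from b_lines whose (filename, path) pair occurs in a_lines."""
    def clean(row):
        # Strip surrounding whitespace from every field before comparing.
        return tuple(field.strip() for field in row)

    # Non-empty rows of the first file, normalised to (filename, path) tuples.
    seen = {clean(row) for row in csv.reader(a_lines) if row}

    targets = []
    for row in csv.reader(b_lines):
        row = clean(row)
        # Skip rows that have no target column.
        if len(row) > 2 and row[:2] in seen:
            targets.append(row[2])
    return targets


# Sample data copied from the question (hypothetical in-memory stand-ins for the files).
a_lines = [
    "build.xml,/home/build/NUOP/project1  ",
    "eesX.java,/home/build/adm/acl",
]
b_lines = [
    "build.xml,/home/build/NUOP/project1,M1",
    "eesX.java,/home/build/adm/acl,M2",
    "ddexse3.htm,/home/class/adm/33eFg",
]

print(" and ".join(matching_targets(a_lines, b_lines)))  # prints: M1 and M2
```

Because `csv.reader` accepts any iterable of lines, open file objects for a.csv and b.csv can be passed in place of the lists.
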
烛影斜 2024-08-25 22:28:40

If M1 and M2 are always in fields 3 and 5, you can try this:

awk -F"," 'FNR==NR{
    # First file (file2.txt): keep the first word of fields 3 and 5, keyed by filename.
    split($3,b," ")
    split($5,c," ")
    a[$1]=b[1]" "c[1]
    next
}
($1 in a){
    # Second file (file1.txt): report filenames that were seen in file2.txt.
    print "found: " $1" "a[$1]
}' file2.txt file1.txt

Output:

$ cat file2.txt
build.xml,/home/build/NUOP/project1,M1 eesX.java,/home/build/adm/acl,M2 ddexse3.htm,/home/class/adm/33eFg
filename, blah,M1 blah, blah, M2 blah , end

$ cat file1.txt
build.xml,/home/build/NUOP/project1 eesX.java,/home/build/adm/acl

$ ./shell.sh
found: build.xml M1 M2
一身骄傲 2024-08-25 22:28:40

Try http://sourceforge.net/projects/csvdiff/

Quote:
csvdiff is a Perl script to diff/compare two csv files with the possibility to select the separator. Differences will be shown like: "Column XYZ in record 999" is different. After this, the actual and the expected result for this column will be shown.
