Excel 比较两个 csv 文件并显示差异
我正在比较两组大的 csv 文件和/或 csv 文件和 .txt 文件。我“认为”.txt 文件可能需要转换为 csv 文件只是为了简单起见,但这可能需要也可能不需要。我想使用 excel、c++ 或 python。我需要将一个“接受的”值列表与测量的列表进行比较,并找出它们之间的差异(如果有)。 Excel 可能是最简单的方法,但 Python 或 C++ 也可能同样有效。这不是家庭作业,所以不用担心这类事情。非常感谢代码建议和/或模板。或网站链接
编辑1
我已经阅读过有关Python的difflib或不同类的信息,但不熟悉如何使用它,并且可能比我想要的更多。
编辑2
这两个文件都有一系列列(它们之间没有画线或任何东西),并且在这些“命名”列下面会有数字。我需要将文件 1 中第 1 列第 1 列中的数字与文件 2 中第 1 列第 1 列中的数字进行比较,如果存在差异,请显示另一个 csv 文件中的差异
I'm looking to compare two big sets of csv files and/or a csv file and a .txt file. I "think" the .txt file may need to be converted to a csv file just for simplicity sake but that may or may not be needed. I either want to use excel, c++, or python. I need to compare one "accepted" value list to a list that is measured and find the difference between them if there is one. Excel may be the easiest way to do this but python or c++ may work just as well. This is not homework so don't worry about that sort of thing. Code advice and/or templates is greatly appreciated. or links to websites
EDIT 1
I've read about Python's difflib or differ class but unfamiliar how to use it and may be more than I want.
EDIT 2
The Files both will have a series of columns(not with lines drawn between them or anything) and below those "named" columns there will be numbers. I need to compare the number in column 1 spot one in file one to column 1 spot one of file 2 and if there is a difference show the difference in another csv file
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您可以使用 ADO(ODBC/JET/OLEDB 文本驱动程序)将“合适的”.txt/.csv/.tab/.flr 文件视为 SQL 数据库中来自每种支持 COM 的语言的表。然后可以使用 SQL 的强大功能来完成比较(DISTINCT、GROUP、(LEFT)JOINS,...)。
添加关于您的评论:
这是您的问题,我不想把您推到您不想去的地方。但如果您需要比较表格数据,SQL 是一个很好的(最好的?)工具。作为发现两个 .txt 文件中差异的脚本输出的证据:
进一步添加:
此 文章涉及 ADO 和文本文件;查找文件 adoNNN.chm
(NNN=版本号,例如210)在您的计算机上;这是一本好书
阿多。
您可以使用 Access 或 OpenOffice Base 来试验 SQL 语句
应用于链接/引用(未导入!)文本数据库。
掌握了最初的障碍后,脚本/程序就会很容易:连接
到数据库,即包含文件和 schema.ini 的文件夹
file 来定义 files=tables 的结构。
上面的输出是通过以下方式生成的:
如果删除/忽略脂肪(创建 SQL 语句、诊断输出),它会沸腾
减少至 6 行
,可以轻松“移植”到每种支持 COM 的语言,因为 ADO
物体承担了所有繁重的工作。当您
想要保存结果集:只需旋转分隔符/分隔符/空参数
并将其转储到文件
(不要忘记将该表的定义添加到您的 schema.ini 中)。的
当然你也可以使用“SELECT/INSERT INTO”,但这样的语句可能不会
很容易正确/通过 ADO 文本驱动程序的解析器。
添加计算:
从 5 x 2 主/批准文件开始,其中包含:
通过附加 Spot 列将其转换为预期.txt
,使其符合
schema.ini 文件中的要求。类似地,转换一个测量文件,如:
到measured.txt
Apply
将结果集写入differences.txt
aFNames = Array( "Num0", ... "Spot" )
oFS.CreateTextFile( sFSpec ).Write _
Join( aFNames, sFSep ) & sRSep& oRS.GetString( adClipString, , sFSep, sRSep, "" )
你会得到:
You can use ADO (ODBC/JET/OLEDB Text Driver) to treat 'decent' .txt/.csv/.tab/.flr files as tables in a SQL Database from every COM-enabled language. Then the comparisons could be done using the power of SQL (DISTINCT, GROUP, (LEFT) JOINS, ...).
Added with regard to your comment:
It's your problem and I don't want to push you where you don't want to go. But SQL is a good (the best?) tool, if you need to compare tabular data. As evidence the output of a script that spots the differences in two .txt files:
Further additions:
This article deals with ADO and text files; look for a file adoNNN.chm
(NNN=Version number, e.g. 210) on your computer; this is a good book about
ADO.
You can use Access or OpenOffice Base to experiment with SQL statements
applied to a linked/referenced (not imported!) text database.
A script/program will be easy after you mastered the initial hurdle: connecting
to the the database, i.e. to a folder containing the files and a schema.ini
file to define the structure of the files=tables.
The output above was generated by:
If you delete/ignore the fat (create SQL statements, diagnostics output), it boils
down to 6 lines
which can be 'ported' easily to every COM-enabled language, because the ADO
objects do all the heavy lifting. The .GetString method comes handy, when you
want to save a resultset: just twiddle the separator/delimiter/Null arguments
and dump it to file
(don't forget to add a definition for that table to your schema.ini). Of
course you also can use a "SELECT/INSERT INTO", but such statements may not
be easy to get right/passed the ADO Text Driver's parser.
Addition wrt Computations:
Start with a 5 x 2 master/approved file containing:
transform it to expected.txt
by appending the Spot column so it conforms to
in your schema.ini file. Similarly, transform a measure file like:
to measured.txt
Apply
Write the resultset to differences.txt
aFNames = Array( "Num0", ... "Spot" )
oFS.CreateTextFile( sFSpec ).Write _
Join( aFNames, sFSep ) & sRSep & oRS.GetString( adClipString, , sFSep, sRSep, "" )
and you get:
您不需要编码,您可以使用文本编辑器中的替换功能使两个文件中的分隔符相同(空格或逗号),并使用 TortoiseSVN 的图形差异工具进行比较: http://tortoisesvn.net/
You don't need to code, you can make separators the same in both files (spaces or commas) using replace function in text editor and compare it using graphical diff tool from TortoiseSVN: http://tortoisesvn.net/