Best way to check whether two files are identical regardless of newline style, using Python
I tried filecmp.cmp(file1, file2), but it doesn't work because the files are identical except for their newline characters. Is there an option for that in filecmp, or some other convenience function or library, or do I have to read both files line by line and compare those?
Comments (4)
I think a simple convenience function like this should do the job:
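One possible shape for such a function, offered as a sketch rather than the answerer's original code. The name same_ignoring_newlines and the encoding parameter are assumptions made for illustration. It relies on Python 3 text mode, which translates "\r\n" and "\r" to "\n" while reading.

```python
from itertools import zip_longest


def same_ignoring_newlines(path1, path2, encoding="utf-8"):
    """Return True if both files contain the same lines, whatever their line endings.

    Hypothetical helper: the name and the encoding default are illustrative.
    """
    # Text mode with the default newline=None normalises line endings to "\n".
    with open(path1, encoding=encoding) as f1, open(path2, encoding=encoding) as f2:
        # zip_longest pads the shorter file with None, which never equals a line,
        # so files with different line counts compare as different.
        return all(a == b for a, b in zip_longest(f1, f2))
```

Because the files are iterated line by line, neither one has to fit in memory.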
Try the difflib module. It provides classes and functions for comparing sequences. For your needs, the difflib.Differ class looks interesting. See the Differ example in the documentation, which compares two texts. The sequences being compared can also be obtained from the readlines() method of file-like objects.
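A minimal sketch of the difflib approach, assuming both files are text and can be decoded with the same encoding. The function name newline_insensitive_diff is hypothetical. difflib.Differ().compare() prefixes unchanged lines with two spaces, so filtering those out leaves only the differences.

```python
import difflib


def newline_insensitive_diff(path1, path2, encoding="utf-8"):
    """Return the Differ output lines that mark a difference (empty list if none).

    Hypothetical helper: the name and the encoding default are illustrative.
    """
    # Text mode normalises "\r\n" and "\r" to "\n", so line endings never show up as a diff.
    with open(path1, encoding=encoding) as f1, open(path2, encoding=encoding) as f2:
        lines1 = f1.readlines()
        lines2 = f2.readlines()
    # Lines starting with "  " are common to both inputs; keep everything else.
    return [line for line in difflib.Differ().compare(lines1, lines2)
            if not line.startswith("  ")]
```

An empty result means the files match apart from line endings. This is heavier than a plain equality check, but it also reports where the files differ.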
The source code for filecmp.cmp() has this for the comparison part:
I modified that to make:
In Python 3, opening a file in read (text) mode automatically converts newlines for you. For older versions, you can add 'U' to the mode. I tested this code in a test bench for a package I am working on, and it seems to work.
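A sketch of what such a modification could look like, under the assumption that the comparison loop in filecmp reads both files in fixed-size binary chunks and that the change amounts to opening them in text mode instead. The function name, the bufsize default, and the encoding parameter are illustrative, not taken from the answer.

```python
def cmp_ignoring_newlines(path1, path2, bufsize=8 * 1024, encoding="utf-8"):
    """Chunked comparison of two text files with line endings normalised.

    Hypothetical helper: name, bufsize and encoding are illustrative choices.
    """
    # Mode "r" (text) instead of "rb": Python 3 translates "\r\n" and "\r" to "\n".
    with open(path1, "r", encoding=encoding) as fp1, \
         open(path2, "r", encoding=encoding) as fp2:
        while True:
            # In text mode, read(n) returns up to n characters after newline translation.
            b1 = fp1.read(bufsize)
            b2 = fp2.read(bufsize)
            if b1 != b2:
                return False
            if not b1:
                # Both files reached end of input at the same point.
                return True
```

Unlike filecmp.cmp(), this sketch skips the os.stat() shortcut and the result cache, since file sizes legitimately differ when only the line endings differ.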
It looks like you just need to check whether the files are the same, ignoring whitespace and newlines.
You can use a function like this
You can improve is_same so that it matches according to your requirements; for example, you could ignore case too.
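A sketch of how a pluggable is_same predicate might be wired up. Only the name is_same comes from the answer; its body, the files_match wrapper, and the encoding parameter are assumptions. This version ignores leading and trailing whitespace on each line, not whitespace inside a line.

```python
from itertools import zip_longest


def is_same(line1, line2):
    """Illustrative predicate: lines match if equal after stripping surrounding whitespace."""
    # To ignore case as well, compare line1.strip().lower() with line2.strip().lower().
    return line1.strip() == line2.strip()


def files_match(path1, path2, encoding="utf-8"):
    """Return True if every pair of corresponding lines satisfies is_same (hypothetical helper)."""
    with open(path1, encoding=encoding) as f1, open(path2, encoding=encoding) as f2:
        # fillvalue="" pads the shorter file with empty lines, so extra blank
        # lines at the end of one file are ignored by the predicate.
        return all(is_same(a, b) for a, b in zip_longest(f1, f2, fillvalue=""))
```

Keeping the comparison rule in its own function means the matching policy can be changed without touching the file-reading loop.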