Comparing two .txt files in Python using difflib

Posted on 2024-07-25 02:51:56

I am trying to compare two text files and output the first string in the comparison file that does not match, but I am having difficulty because I am very new to Python. Can anybody give me an example of how to use this module?

When I try something like:

result = difflib.SequenceMatcher(None, testFile, comparisonFile)

I get an error saying that an object of type 'file' has no len().

Comments (6)

相守太难 2024-08-01 02:51:56

For starters, you need to pass strings (or other sequences that support len()) to difflib.SequenceMatcher, not file objects:

# Like so
difflib.SequenceMatcher(None, str1, str2)

# Or just read the files in
difflib.SequenceMatcher(None, file1.read(), file2.read())

That'll fix your error.

To get the first non-matching string, see the difflib documentation.
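
One possible way to get that first non-matching string is sketched here. It assumes each file has already been read into a list of lines, and the helper name first_mismatch is made up for illustration. It walks the opcodes that SequenceMatcher.get_opcodes() reports and returns the first line of the comparison file that falls in a 'replace' or 'insert' block:

```python
import difflib

def first_mismatch(test_lines, comparison_lines):
    """Return the first line of comparison_lines that has no
    counterpart in test_lines, or None if there is none."""
    matcher = difflib.SequenceMatcher(None, test_lines, comparison_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # tag is one of 'equal', 'replace', 'delete' or 'insert';
        # only 'replace' and 'insert' contribute lines from the
        # comparison side (indices j1..j2).
        if tag in ("replace", "insert"):
            return comparison_lines[j1]
    return None

# In-memory stand-ins for the result of file.readlines()
test_lines = ["alpha\n", "beta\n", "gamma\n"]
comparison_lines = ["alpha\n", "BETA\n", "gamma\n"]
print(repr(first_mismatch(test_lines, comparison_lines)))
```

With real files, the two lists would come from readlines() on each opened file. Lines that exist only in the test file show up as 'delete' blocks and are skipped, since they have no counterpart in the comparison file.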

在梵高的星空下 2024-08-01 02:51:56

Here is a quick example of comparing the contents of two files using Python difflib...

import difflib

file1 = "myFile1.txt"
file2 = "myFile2.txt"

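# Read both files as lists of lines and close them afterwards
with open(file1) as f1, open(file2) as f2:
    diff = difflib.ndiff(f1.readlines(), f2.readlines())

# Each diff line already ends with a newline, so suppress print's own
print(''.join(diff), end='')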
写给空气的情书 2024-08-01 02:51:56

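Are you sure both files exist?

I just tested it and got a perfect result.

To print the results I use something like:

import difflib

# testFile and comparisonFile are the paths of the two files
with open(testFile) as f1, open(comparisonFile) as f2:
    diff = difflib.ndiff(f1.readlines(), f2.readlines())

# ndiff returns a generator, so iterate over it directly
for line in diff:
    print(line, end='')

The first character of each output line indicates whether the lines differ: for example, '+' means the line was added, '-' means it was removed, and a space means it is unchanged.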
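
As a rough sketch of how those prefixes can be used, the hypothetical helper below (changed_lines is not part of difflib) keeps only the ndiff lines that start with '- ' or '+ ' and drops the unchanged lines and the '? ' hint lines:

```python
import difflib

def changed_lines(a_lines, b_lines):
    """Return only the ndiff output lines marked as removed or added."""
    return [
        line
        for line in difflib.ndiff(a_lines, b_lines)
        # '- ' = only in a_lines, '+ ' = only in b_lines
        if line.startswith(("- ", "+ "))
    ]

# In-memory stand-ins for the result of file.readlines()
old = ["one\n", "two\n", "three\n"]
new = ["one\n", "2\n", "three\n"]
for line in changed_lines(old, new):
    print(line, end="")
```

The first element of the returned list is then the first difference that ndiff found.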

并安 2024-08-01 02:51:56

It sounds like you may not need difflib at all. If you're comparing line by line, try something like this:

with open("test.txt") as f:
    test_lines = f.readlines()
with open("correct.txt") as f:
    correct_lines = f.readlines()

for test, correct in zip(test_lines, correct_lines):
    if test != correct:
        print("Oh no! Expected %r; got %r." % (correct, test))
        break
else:
    # No mismatch in the lines both files share, so compare the lengths
    len_diff = len(test_lines) - len(correct_lines)
    if len_diff > 0:
        print("Test file had too much data.")
    elif len_diff < 0:
        print("Test file had too little data.")
    else:
        print("Everything was correct!")
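
A variant of the same idea, sketched under the assumption that returning the first difference is more useful than printing it: itertools.zip_longest pads the shorter input with None, so a length mismatch is reported as a difference without a separate length check. The function name first_difference is invented for this example.

```python
from itertools import zip_longest

def first_difference(test_lines, correct_lines):
    """Return (line_number, expected, got) for the first differing
    line, or None if both inputs are identical."""
    pairs = zip_longest(test_lines, correct_lines)
    for number, (test, correct) in enumerate(pairs, start=1):
        # A missing line shows up as None on the shorter side.
        if test != correct:
            return number, correct, test
    return None

# In-memory stand-ins; open file objects can be passed as well
print(first_difference(["a\n", "b\n"], ["a\n", "c\n"]))
print(first_difference(["a\n", "b\n"], ["a\n"]))
```

Because the function only iterates over its arguments, two open file objects can be passed directly and neither file has to be loaded into memory in full.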
阳光①夏 2024-08-01 02:51:56

Another, simpler way to check whether two text files are the same line by line. Try it out.

import sys

fname1 = 'text1.txt'
fname2 = 'text2.txt'

with open(fname1) as f1, open(fname2) as f2:
    lines1 = f1.readlines()
    lines2 = f2.readlines()

# Print the first line of the first file that differs, then stop
for line1, line2 in zip(lines1, lines2):
    if line1 != line2:
        print(line1)
        sys.exit(0)

# zip stops at the shorter file, so also compare the line counts
if len(lines1) == len(lines2):
    print("both are equal")
else:
    print("files differ in length")

Alternatively, you can use the filecmp module from Python's standard library.

import filecmp

fname1 = 'text1.txt'
fname2 = 'text2.txt'

# Prints True if the files are considered equal, False otherwise
print(filecmp.cmp(fname1, fname2))

:)
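
One detail worth knowing about filecmp.cmp: by default (shallow=True) it treats two files as equal when their os.stat() signatures (type, size, modification time) match, and only reads the contents otherwise. The sketch below passes shallow=False to force a content comparison; the wrapper name same_contents and the temporary file names are placeholders chosen for the demonstration.

```python
import filecmp
import os
import tempfile

def same_contents(path1, path2):
    """Return True only if both files hold identical bytes."""
    # shallow=False makes filecmp read and compare the contents
    # instead of accepting matching os.stat() signatures.
    return filecmp.cmp(path1, path2, shallow=False)

# Demonstration on throwaway files (the names are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")
    path_c = os.path.join(tmp, "c.txt")
    samples = ((path_a, "one\ntwo\n"), (path_b, "one\ntwo\n"), (path_c, "one\nTWO\n"))
    for path, text in samples:
        with open(path, "w") as handle:
            handle.write(text)
    identical = same_contents(path_a, path_b)
    different = same_contents(path_a, path_c)

print(identical, different)
```

Note that filecmp only answers whether the files are equal; it does not say where they differ.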

给不了的爱 2024-08-01 02:51:56
# -*- coding: utf-8 -*-
"""
Report every line of each file that has no case-insensitive match
anywhere in the other file.
"""

def compare_lines_in_files(file1_path, file2_path):
    try:
        with open(file1_path, 'r', encoding='utf-8') as file1, open(file2_path, 'r', encoding='utf-8') as file2:
            lines_file1 = file1.readlines()
            lines_file2 = file2.readlines()

            mismatched_lines = []

            # Compare each line in file1 to all lines in file2
            for line_num, line1 in enumerate(lines_file1, start=1):
                line1 = line1.strip()  # Remove leading/trailing whitespace
                found_match = False

                for line_num2, line2 in enumerate(lines_file2, start=1):
                    line2 = line2.strip()  # Remove leading/trailing whitespace

                    # Perform a case-insensitive comparison
                    if line1.lower() == line2.lower():
                        found_match = True
                        break

                if not found_match:
                    mismatched_lines.append(f"Line {line_num} in File 1: '{line1}' has no match in File 2")

            # Compare each line in file2 to all lines in file1 (vice versa)
            for line_num2, line2 in enumerate(lines_file2, start=1):
                line2 = line2.strip()  # Remove leading/trailing whitespace
                found_match = False

                for line_num, line1 in enumerate(lines_file1, start=1):
                    line1 = line1.strip()  # Remove leading/trailing whitespace

                    # Perform a case-insensitive comparison
                    if line2.lower() == line1.lower():
                        found_match = True
                        break

                if not found_match:
                    mismatched_lines.append(f"Line {line_num2} in File 2: '{line2}' has no match in File 1")

            return mismatched_lines

    except FileNotFoundError:
        print("One or both files not found.")
        return []

# Paths to the two text files you want to compare
file1_path = r'C:\Python Space\T1.txt'
file2_path = r'C:\Python Space\T2.txt'

mismatched_lines = compare_lines_in_files(file1_path, file2_path)

if mismatched_lines:
    print("Differences between the files:")
    for line in mismatched_lines:
        print(line)
else:
    print("No differences found between the files.")
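
The nested loops above compare every line of one file with every line of the other. A possible shortcut, shown only as a sketch with the same matching rule (surrounding whitespace stripped, case ignored), is to normalize one file's lines into a set first; the helper name unmatched_lines is an assumption made for this example.

```python
def unmatched_lines(lines_a, lines_b):
    """Return the stripped lines of lines_a that have no
    case-insensitive match anywhere in lines_b."""
    # Normalize once so each lookup is a set membership test.
    normalized_b = {line.strip().lower() for line in lines_b}
    return [
        line.strip()
        for line in lines_a
        if line.strip().lower() not in normalized_b
    ]

# In-memory stand-ins for the result of file.readlines()
file1_lines = ["Hello\n", "World\n"]
file2_lines = ["hello\n", "Python\n"]
print(unmatched_lines(file1_lines, file2_lines))
print(unmatched_lines(file2_lines, file1_lines))
```

Calling it once in each direction covers the two passes of the original function.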