Easiest way to compare two Excel files in Java?

Posted 2024-07-19 13:50:27 · 185 words · 4 views · 0 comments

I'm writing a JUnit test for some code that produces an Excel file (which is binary). I have another Excel file that contains my expected output. What's the easiest way to compare the actual file to the expected file?

Sure I could write the code myself, but I was wondering if there's an existing method in a trusted third-party library (e.g. Spring or Apache Commons) that already does this.

Comments (12)

慵挽 2024-07-26 13:50:27

You might consider using my project simple-excel which provides a bunch of Hamcrest Matchers to do the job.

When you do something like the following,

assertThat(actual, WorkbookMatcher.sameWorkbook(expected));

You'd see, for example,

java.lang.AssertionError:
Expected: entire workbook to be equal
     but: cell at "C14" contained <"bananas"> expected <nothing>,
          cell at "C15" contained <"1,850,000 EUR"> expected <"1,850,000.00 EUR">,
          cell at "D16" contained <nothing> expected <"Tue Sep 04 06:30:00">
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)

That way, you can run it from your automated tests and get meaningful feedback whilst you're developing.

You can read more about it in this article on my site.

a√萤火虫的光℡ 2024-07-26 13:50:27

Here's what I ended up doing (with the heavy lifting being done by DBUnit):

/**
 * Compares the data in the two Excel files represented by the given input
 * streams, closing them on completion
 * 
 * @param expected can't be <code>null</code>
 * @param actual can't be <code>null</code>
 * @throws Exception
 */
private void compareExcelFiles(InputStream expected, InputStream actual)
  throws Exception
{
  try {
    Assertion.assertEquals(new XlsDataSet(expected), new XlsDataSet(actual));
  }
  finally {
    IOUtils.closeQuietly(expected);
    IOUtils.closeQuietly(actual);
  }
}

This compares only the data in the two files, so irrelevant metadata that happens to differ cannot cause a spurious failure. Hope this helps someone.

朕就是辣么酷 2024-07-26 13:50:27

A simple file comparison can easily be done using some checksumming (like MD5) or just reading both files.

However, as Excel files contain loads of metadata, the files will probably never be identical byte-for-byte, as James Burgess pointed out.
So you'll need another kind of comparison for your test.
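
As a rough sketch of the byte-level check described above, using only the JDK (the class and method names here are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library mentioned in this thread.
class ChecksumCompare {

    /** Returns the MD5 digest of the file's raw bytes. */
    static byte[] md5(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        return digest.digest(Files.readAllBytes(file));
    }

    /** True when both files have the same MD5 digest, i.e. the same bytes (barring collisions). */
    static boolean sameBytes(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return MessageDigest.isEqual(md5(a), md5(b));
    }
}
```

For two Excel files written at different times this is likely to return false even when every cell matches, which is why a data-level comparison is needed.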

I'd recommend somehow generating a "canonical" form from the Excel file, i.e. reading the generated Excel file and converting it to a simpler format (CSV or something similar), which will only retain the information you want to check. Then you can use the "canonical form" to compare with your expected result (also in canonical form, of course).
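
One possible shape for such a canonical form, as a sketch only. It assumes the cell text has already been read into rows of strings (for instance with Apache POI, not shown here), and the names CanonicalForm, quote and toCanonical are hypothetical:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: turns rows of cell text into a stable CSV-like string.
class CanonicalForm {

    /** Quotes one cell the CSV way: wrap in quotes, double any embedded quotes. */
    static String quote(String cell) {
        String value = (cell == null) ? "" : cell; // treat a missing cell as empty
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins cells with commas and rows with newlines. */
    static String toCanonical(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream()
                        .map(CanonicalForm::quote)
                        .collect(Collectors.joining(",")))
                .collect(Collectors.joining("\n"));
    }
}
```

A test can then build the canonical string for both files and compare them with assertEquals, so a failure shows exactly which text differs.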

Apache POI might be useful for reading the file.

BTW: Reading a whole file to check its correctness would generally not be considered a unit test. That's an integration test...

べ映画 2024-07-26 13:50:27

I needed to do something similar and was already using the Apache POI library in my project to create Excel files. So I opted to use the included ExcelExtractor interface to export both workbooks as a string of text and asserted that the strings were equal. There are implementations for both HSSF for .xls as well as XSSF for .xlsx.

Dump to string:

XSSFWorkbook xssfWorkbookA = ...;
String workbookA = new XSSFExcelExtractor(xssfWorkbookA).getText();

ExcelExtractor has some options for what should be included in the string dump. I found its defaults useful: it includes the sheet names as well as the text contents of the cells.

轻许诺言 2024-07-26 13:50:27

The easiest way I find is to use Tika.
I use it like this:

private void compareXlsx(File expected, File result) throws IOException, TikaException {
     Tika tika = new Tika();
     String expectedText = tika.parseToString(expected);
     String resultText = tika.parseToString(result);
     assertEquals(expectedText, resultText);
}


<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers</artifactId>
    <version>1.13</version>
    <scope>test</scope>
</dependency>

一花一树开 2024-07-26 13:50:27

You could use javaxdelta to check whether the two files are the same. It's available from here:

http://javaxdelta.sourceforge.net/

椒妓 2024-07-26 13:50:27

Just found out there's something in commons-io's FileUtils. Thanks for the other answers.

新一帅帅 2024-07-26 13:50:27

You may use Beyond Compare 3 which can be started from command-line and supports different ways to compare Excel files, including:

  • Comparing Excel sheets as database tables
  • Checking all textual content
  • Checking textual content with some formatting

旧人 2024-07-26 13:50:27

To test only the content of the first sheet, in Kotlin (easily converted to Java):

private fun checkEqualityExcelDocs(doc : XSSFWorkbook, doc1 : XSSFWorkbook) : Boolean{
        val mapOfCellDoc = doc.toList().first().toList().flatMap { row -> row.map { Pair(PivotExcelCreator.IndexInThePivotTable(it.rowIndex,it.columnIndex),it.stringCellValue) }}.toMap()
        val mapOfCellDoc1 = doc1.toList().first().toList().flatMap { row -> row.map { Pair(PivotExcelCreator.IndexInThePivotTable(it.rowIndex,it.columnIndex),it.stringCellValue) }}.toMap()
        if(mapOfCellDoc.size == mapOfCellDoc1.size){
            return mapOfCellDoc.entries.all { mapOfCellDoc1.containsKey(it.key) && mapOfCellDoc[it.key] == mapOfCellDoc1[it.key]}
        }
        return false
    }

data class IndexInThePivotTable(val row: Int, val col: Int)

Then add an assertion in your code:

    assertTrue(checkEqualityExcelDocs(expected, actual), "Docs aren't equal!")

As you can see, doc.toList().first() takes only the first sheet of the document. If you need to compare every sheet, change the code a little.

It is also a good idea to ignore cells that contain an empty string "". I didn't need that, but it is simple to add if you do.
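
If you do want to ignore empty cells, a minimal Java sketch could filter them out before comparing. It assumes the cells have been collected into a map from position to text, as in the Kotlin code above; BlankCells and withoutBlanks are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: drops cells whose text is null or empty.
class BlankCells {

    /** Returns a copy of the map without null or "" values. */
    static <K> Map<K, String> withoutBlanks(Map<K, String> cells) {
        Map<K, String> result = new HashMap<>();
        for (Map.Entry<K, String> entry : cells.entrySet()) {
            String text = entry.getValue();
            if (text != null && !text.isEmpty()) {
                result.put(entry.getKey(), text);
            }
        }
        return result;
    }
}
```

Applying it to both maps before the equality check means a cell that exists but is empty is treated the same as a cell that is missing.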


This may also be useful:

// the first doc is read back from an output stream like this
val out = ByteArrayOutputStream()
// call some method here that writes the Excel document to the output stream
val firstDoc = XSSFWorkbook(ByteArrayInputStream(out.toByteArray()))

And the second document, loaded from a file to compare against:

val secondDoc = XSSFWorkbook(Test::class.java.getClassLoader().getResource("yourfile.xlsx").path)

长发绾君心 2024-07-26 13:50:27

After a lot of research, this is what worked best for me: use Apache POI as an external library, with XSSF for .xlsx files. To compare .xls files you have to use HSSF instead.

Good luck

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public static void compareTwoExcellFiles04(String file01, String file02) throws IOException {
    // try-with-resources closes the streams and workbooks even if reading fails
    try (InputStream fis1 = new FileInputStream(file01);
         InputStream fis2 = new FileInputStream(file02);
         XSSFWorkbook wb1 = new XSSFWorkbook(fis1);
         XSSFWorkbook wb2 = new XSSFWorkbook(fis2)) {
        // only the first sheet of each workbook is compared
        List<String> cells1 = readCells(wb1.getSheetAt(0));
        List<String> cells2 = readCells(wb2.getSheetAt(0));
        if (cells1.equals(cells2)) {
            System.out.println("the files are equal");
        } else {
            System.out.println("NOT EQUAL");
        }
    }
}

// Collects the displayed text of every cell, row by row
private static List<String> readCells(XSSFSheet sheet) {
    // DataFormatter renders numeric, date and boolean cells as text;
    // getStringCellValue() throws on cells that are not strings
    DataFormatter formatter = new DataFormatter();
    List<String> values = new ArrayList<>();
    for (Row row : sheet) {
        for (Cell cell : row) {
            values.add(formatter.formatCellValue(cell));
        }
    }
    return values;
}

陪你搞怪i 2024-07-26 13:50:27

Maybe... compare MD5 digests of each file? I'm sure there are a lot of ways to do it. You could just open both files and compare each byte.

EDIT: James stated how the XLS format might have differences in the metadata. Perhaps you should use the same interface you used to generate the xls files to open them and compare the values from cell to cell?
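
A cell-by-cell comparison along those lines might look like the sketch below. It assumes each workbook has already been read into a map from cell reference (such as "C14") to displayed text; how the maps are filled depends on the library used to generate the files and is not shown. CellDiff and diff are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

// Hypothetical helper: reports every cell whose value differs between two workbooks.
class CellDiff {

    /** Returns one message per differing cell, ordered by cell reference. */
    static List<String> diff(Map<String, String> expected, Map<String, String> actual) {
        // union of all cell references seen in either workbook, in sorted order
        TreeSet<String> refs = new TreeSet<>(expected.keySet());
        refs.addAll(actual.keySet());

        List<String> problems = new ArrayList<>();
        for (String ref : refs) {
            String want = expected.get(ref); // null when the cell is absent
            String got = actual.get(ref);
            if (!Objects.equals(want, got)) {
                problems.add("cell " + ref + " contained <" + got + "> expected <" + want + ">");
            }
        }
        return problems;
    }
}
```

In a JUnit test you could assert that the returned list is empty and use the list itself as the failure message, so every mismatching cell is reported at once.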
