Java FileInputStream - differences depending on how the file is referenced: classloader vs. file system

Posted on 2024-08-10 20:52:18

I'm using Apache POI to extract some data from an Excel file.
I need an InputStream to instantiate the POI HSSFWorkbook class:

    HSSFWorkbook wb = new HSSFWorkbook(inputStreamX);

I see different behaviour depending on how I construct the InputStream object:

    InputStream inputStream = new FileInputStream(new File("/home/xxx/workspace/myproject/test/resources/importTest.xls"));
    InputStream inputStream2 = new FileInputStream(getClass().getResource("/importTest.xls").getFile());
    InputStream inputStream3 = new ClassPathResource("importTest.xls").getInputStream();

If I construct the POI object with inputStream, it works fine.
But inputStream2 and inputStream3 both throw this exception:

    java.io.IOException: Invalid header signature; read -2300849302551019537, expected -2226271756974174256
        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:100)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:84)

It seems that the header of the binary file is different, so the library can't recognize it as an Excel file. I can't understand why.
The only difference I see is that inputStream2 and inputStream3 use the classloader to locate the file (ClassPathResource is a Spring class).

I'd like to keep the file path independent of the local system, so I would prefer something like inputStream2 or inputStream3.

Do you have any idea why this is happening?

Thank you

Update:
I tried writing inputStream and inputStream2 to disk.
The Excel file produced from inputStream is OK. The one produced from inputStream2 contains some strange characters wrapped around the real content.
It seems that Maven corrupts the Excel file in some way during the build.
So basically it is the file I retrieve with the classloader (under /home/xxx/workspace/myproject/target/test-classes/importTest.xls) that is not OK.
Any ideas?
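
The two numbers in the exception are the first 8 bytes of the file read as a little-endian long, so they can be decoded to see what was actually on disk. The following is a minimal sketch for doing that; the class name `Ole2HeaderCheck` and its helper methods are made up for illustration and are not part of POI.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Ole2HeaderCheck {
    // The "expected" value from the exception message, written in hex.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    /** Formats a long as the 8 bytes it occupies on disk (little-endian). */
    static String toHexBytes(long value) {
        byte[] bytes = ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Reads the first 8 bytes of a stream as a little-endian long. */
    static long readSignature(InputStream in) throws IOException {
        byte[] header = new byte[8];
        int off = 0;
        while (off < header.length) {
            int n = in.read(header, off, header.length - off);
            if (n < 0) {
                throw new IOException("Stream is shorter than 8 bytes");
            }
            off += n;
        }
        return ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** True if the stream starts with the signature the exception expects. */
    static boolean looksLikeOle2(InputStream in) throws IOException {
        return readSignature(in) == OLE2_SIGNATURE;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("expected: " + toHexBytes(-2226271756974174256L));
        System.out.println("read:     " + toHexBytes(-2300849302551019537L));
        // Optionally check a real file passed on the command line.
        if (args.length > 0) {
            try (InputStream in = new FileInputStream(args[0])) {
                System.out.println(args[0] + " OLE2 header: " + looksLikeOle2(in));
            }
        }
    }
}
```

Decoded this way, the expected bytes are `D0 CF 11 E0 A1 B1 1A E1` and the bytes actually read are `EF BF BD EF BF BD 11 E0`. `EF BF BD` is the UTF-8 encoding of the replacement character U+FFFD, which is consistent with the file having been decoded as text and written back out somewhere along the way, rather than copied byte for byte.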

Comments (2)

情未る 2024-08-17 20:52:18

The problem seems to be Maven's filtering option.
If the pom looks like this:

            <testResource>
                <directory>${basedir}/src/test/resources</directory>
                <includes>
                    <include>**/*.xml</include>
                    <include>**/*.properties</include>
                    <include>**/*.sql</include>
                    <include>**/*.xls</include>
                </includes>
                <filtering>true</filtering>
            </testResource>

When filtering is set to true for .xls files, Maven processes them as text while copying them, and that corrupts them.
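
One way to keep filtering for the text resources while leaving the spreadsheets untouched is to declare the same directory twice, once filtered and once not. This is only a sketch based on the fragment above; the enclosing `<testResources>` element is assumed, since the original pom is not shown in full.

```xml
<testResources>
    <!-- Text resources: filtered, so ${...} placeholders get substituted. -->
    <testResource>
        <directory>${basedir}/src/test/resources</directory>
        <includes>
            <include>**/*.xml</include>
            <include>**/*.properties</include>
            <include>**/*.sql</include>
        </includes>
        <filtering>true</filtering>
    </testResource>
    <!-- Binary resources: copied as-is, with filtering switched off. -->
    <testResource>
        <directory>${basedir}/src/test/resources</directory>
        <includes>
            <include>**/*.xls</include>
        </includes>
        <filtering>false</filtering>
    </testResource>
</testResources>
```

The maven-resources-plugin also has a `nonFilteredFileExtensions` setting that can be used for the same purpose if you would rather keep a single `<testResource>` entry.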

是你 2024-08-17 20:52:18

Have you tried ClassLoader#getResourceAsStream(String)? It will probably behave similarly to your second attempt using Class#getResource(String), as alluded to in the latter's documentation.

My first thought here was that no such file was found, but if it consistently reads the same value (-2300849302551019537) each time you run the program, that suggests there really is a file there that's being read. Set a breakpoint on the statement after you initialize your InputStream and inspect the stream instance in the debugger. You should be able to find a reference to the underlying file name. To make this easier at first, try using ClassLoader#getResources(String) and inspect the sequence of URLs returned.
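
A rough sketch of that check, in case it helps: the class `ResourceLocator` and its methods are hypothetical names for illustration, and `importTest.xls` is just the resource name from the question. Note that ClassLoader methods take the name without a leading slash, unlike Class#getResource(String).

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class ResourceLocator {

    /** Collects every URL under which the loader can see the named resource. */
    static List<URL> locate(ClassLoader loader, String name) throws IOException {
        List<URL> urls = new ArrayList<>();
        Enumeration<URL> found = loader.getResources(name);
        while (found.hasMoreElements()) {
            urls.add(found.nextElement());
        }
        return urls;
    }

    /** Opens the resource as a stream, failing loudly instead of returning null. */
    static InputStream open(ClassLoader loader, String name) throws IOException {
        InputStream in = loader.getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Resource not on classpath: " + name);
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        // Resource name is an assumption taken from the question.
        String name = args.length > 0 ? args[0] : "importTest.xls";
        ClassLoader loader = ResourceLocator.class.getClassLoader();

        // Shows which copy (or copies) of the file is actually being picked up.
        for (URL url : locate(loader, name)) {
            System.out.println("found at: " + url);
        }

        try (InputStream in = open(loader, name)) {
            System.out.println("first byte: " + in.read());
        }
    }
}
```

If the URL printed points into target/test-classes, the stream is reading the copy produced by the build rather than the original under the source tree, which is where to look for the difference.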
