Java: Reading from an InputStream does not always read the same amount of data

Posted on 2024-12-11 13:52:50

For good or bad I have been using code like the following without any problems:

ZipFile aZipFile = new ZipFile(fileName);   
InputStream zipInput = aZipFile.getInputStream(name);  
int theSize = zipInput.available();  
byte[] content = new byte[theSize];  
zipInput.read(content, 0, theSize);

I have used this approach (getting the available size and reading straight into a byte buffer) for file I/O without any issues, and I have used it with zip files as well.

But recently I ran into a case where zipInput.read(content, 0, theSize); actually reads 3 bytes fewer than theSize.

And since the code does not loop and check the length returned by zipInput.read(content, 0, theSize); I end up with the file missing its last 3 bytes, and later the program cannot function properly (the file is a binary file).

Strangely enough, with different zip files of larger size, e.g. 1075 bytes (in my case the problematic zip entry is 867 bytes), the code works fine!

I understand that the logic of the code is probably not the "best", but why am I suddenly getting this problem now?

And how come it works if I immediately run the program with a larger zip entry?

Any input is highly welcome

Thanks

薆情海 2024-12-18 13:52:50

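From the InputStream read API documentation:

An attempt is made to read as many as len bytes, but a smaller number may be read.

... and:

Returns: the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.

In other words, unless the read method returns -1, there is still more data available to read, but you cannot rely on read to read exactly the specified number of bytes. The specified number of bytes is an upper bound on the amount of data it will read.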
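
The read contract described above can be sketched as a loop that keeps calling read until the buffer is full. This is only an illustration, not part of the original answer: the class ReadFullyDemo and the helpers readFully and trickle are hypothetical names, and trickle exists purely to simulate a stream that hands back fewer bytes than requested (864 of 867, mirroring the 3 missing bytes in the question).

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class ReadFullyDemo {

    /** Hypothetical helper: fills buf completely or throws EOFException. */
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int offset = 0;
        while (offset < buf.length) {
            // read() may return fewer bytes than requested, so track progress.
            int n = in.read(buf, offset, buf.length - offset);
            if (n == -1) {
                throw new EOFException(
                        "stream ended after " + offset + " of " + buf.length + " bytes");
            }
            offset += n;
        }
    }

    /** Test-only stream that never returns more than maxChunk bytes per read call. */
    static InputStream trickle(byte[] data, int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[867];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }

        // A single read call is allowed to come up short.
        byte[] single = new byte[data.length];
        int n = trickle(data, 864).read(single, 0, single.length);
        System.out.println("single read returned " + n + " of " + data.length);

        // The loop keeps going until every byte has arrived.
        byte[] full = new byte[data.length];
        readFully(trickle(data, 864), full);
        System.out.println("loop read everything: " + Arrays.equals(data, full));
    }
}
```

The JDK already ships this behavior as DataInputStream.readFully, so the hand-written loop is mainly useful for seeing why the return value of read has to be checked.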

菩提树下叶撕阳。 2024-12-18 13:52:50

available() is not guaranteed to return the total number of bytes remaining up to the end of the stream.
See the Javadoc for InputStream's available() method, which says:

Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

An example solution for your problem can be as follows:

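// needs java.io.ByteArrayOutputStream, java.io.InputStream, java.util.zip.ZipFile
ZipFile aZipFile = new ZipFile(fileName);
InputStream zipInput = aZipFile.getInputStream( caImport );
ByteArrayOutputStream contentStream = new ByteArrayOutputStream();
byte[] buffer = new byte[ 4096 ];
int bytesRead;
// read() may return fewer bytes than requested, so loop until it returns -1 (end of stream)
while ( ( bytesRead = zipInput.read( buffer ) ) != -1 )
{
    // here, do whatever you want with the bytes just read
    contentStream.write( buffer, 0, bytesRead );
} // while not at end of stream
byte[] contentBytes = contentStream.toByteArray();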
...

This works for sure on all sizes of input files.
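
Since the Javadoc quoted above rules out sizing a buffer from available(), one alternative for zip entries is to take the size from the entry itself. The following is a sketch under assumptions, not the answerer's code: the class ZipEntrySizeDemo and the helpers readEntry and writeSampleZip are hypothetical, and the temporary zip file only exists so the example can run on its own. ZipEntry.getSize() returns -1 when the size is unknown, so that case is rejected here rather than guessed at.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipEntrySizeDemo {

    /** Hypothetical helper: reads one entry using the size recorded in the zip file. */
    static byte[] readEntry(ZipFile zip, String entryName) throws IOException {
        ZipEntry entry = zip.getEntry(entryName);
        if (entry == null) {
            throw new IOException("no such entry: " + entryName);
        }
        long size = entry.getSize(); // uncompressed size, or -1 if unknown
        if (size < 0 || size > Integer.MAX_VALUE) {
            throw new IOException("entry size unknown or too large: " + size);
        }
        byte[] content = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(zip.getInputStream(entry))) {
            // readFully loops internally and throws EOFException on a short stream.
            in.readFully(content);
        }
        return content;
    }

    /** Writes a one-entry zip to a temporary file so the sketch is self-contained. */
    static Path writeSampleZip(String entryName, byte[] payload) throws IOException {
        Path zipPath = Files.createTempFile("sample", ".zip");
        try (OutputStream fileOut = Files.newOutputStream(zipPath);
             ZipOutputStream zipOut = new ZipOutputStream(fileOut)) {
            zipOut.putNextEntry(new ZipEntry(entryName));
            zipOut.write(payload);
            zipOut.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[867];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) (i * 31);
        }
        Path zipPath = writeSampleZip("data.bin", payload);
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            byte[] content = readEntry(zip, "data.bin");
            System.out.println("read " + content.length + " bytes");
        } finally {
            Files.deleteIfExists(zipPath);
        }
    }
}
```

When the entry size may be unknown, reading until read returns -1 is the safer route, because it does not depend on any size reported up front.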

夜吻♂芭芘 2024-12-18 13:52:50

The best way to do this is probably as follows:

public static byte[] readZipFileToByteArray(ZipFile zipFile, ZipEntry entry)
    throws IOException {
    InputStream in = null;
    try {
        in = zipFile.getInputStream(entry);
        return IOUtils.toByteArray(in);
    } finally {
        IOUtils.closeQuietly(in);
    }
}

where the IOUtils.toByteArray(in) method (from Apache Commons IO) keeps reading until EOF and then returns the byte array.
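
If adding Commons IO is not an option, the same read-until-EOF idea can be sketched with the JDK alone, assuming Java 9 or later, where InputStream.readAllBytes() is available. The class ReadAllBytesDemo and the helper drain are hypothetical names, and the ByteArrayInputStream merely stands in for zipFile.getInputStream(entry) so the example runs without a zip file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class ReadAllBytesDemo {

    /** Hypothetical helper: reads the stream to EOF and closes it (Java 9+). */
    static byte[] drain(InputStream source) throws IOException {
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = source) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        // In the zip case this would be drain(zipFile.getInputStream(entry)).
        byte[] copy = drain(new ByteArrayInputStream(data));
        System.out.println("same content: " + Arrays.equals(data, copy));
    }
}
```

The try-with-resources block plays the role of the finally / closeQuietly pair in the answer's code.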
