Java: Reading from an InputStream does not always read the same amount of data

Posted on 2024-12-11 13:52:50

For better or worse, I have been using code like the following without any problems:

ZipFile aZipFile = new ZipFile(fileName);   
InputStream zipInput = aZipFile.getInputStream(name);  
int theSize = zipInput.available();  
byte[] content = new byte[theSize];  
zipInput.read(content, 0, theSize);

I have used this logic (getting the available size and reading directly into a byte buffer) for file I/O without any issues, and I have used it with zip files as well.

But recently I hit a case where zipInput.read(content, 0, theSize); actually reads 3 bytes fewer than the theSize reported as available.

Since the code does not loop and check the length returned by zipInput.read(content, 0, theSize);, I end up with the last 3 bytes of the file missing, and later the program cannot function properly (the file is a binary file).

Strangely enough, with different zip files of larger size, e.g. 1075 bytes (in my case the problematic zip entry is 867 bytes), the code works fine!

I understand that the logic of the code is probably not the "best", but why am I suddenly getting this problem now?

And how come it works if I immediately run the program with a larger zip entry?

Any input is highly welcome

Thanks

Comments (3)

薆情海 2024-12-18 13:52:50

From the InputStream read API docs:

An attempt is made to read as many as len bytes, but a smaller number
may be read.

... and:

Returns: the total number of bytes read into the buffer, or -1 if
there is no more data because the end of the stream has been reached.

In other words, as long as read has not returned -1, the end of the stream has not been reached and more data may remain, but you cannot rely on read to read exactly the specified number of bytes. The specified number of bytes is only an upper bound on how much data a single call will read.
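To make that concrete, here is a minimal sketch of the usual loop: keep calling read until the buffer is full or -1 comes back. The class name ReadLoopSketch, the helper readExactly, and the shortReads test stream are made up for illustration and are not part of the original code. The test stream caps every read at 3 bytes just to force short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadLoopSketch {

    /** Reads exactly {@code length} bytes, calling read() repeatedly until the buffer is full. */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] content = new byte[length];
        int offset = 0;
        while (offset < length) {
            // read() may deliver fewer bytes than requested, so ask only for the remainder.
            int n = in.read(content, offset, length - offset);
            if (n == -1) {
                throw new EOFException("Stream ended after " + offset + " of " + length + " bytes");
            }
            offset += n;
        }
        return content;
    }

    /** Hypothetical test stream (an assumption of this sketch): at most 3 bytes per read() call. */
    static InputStream shortReads(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[867];

        // A single read() call is allowed to stop early.
        int firstRead = shortReads(data).read(new byte[data.length], 0, data.length);

        // The loop keeps going until every requested byte has arrived.
        byte[] all = readExactly(shortReads(data), data.length);

        System.out.println("single read: " + firstRead + ", loop: " + all.length);
        // prints "single read: 3, loop: 867"
    }
}
```

If you are on a recent JDK, DataInputStream.readFully does the same job for a known length, so you may not need a hand-written loop at all.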

菩提树下叶撕阳。 2024-12-18 13:52:50

Calling available() does not guarantee that the result is the total number of bytes remaining until the end of the stream.
See the documentation of InputStream's available() method, which says:

Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.

Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
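As a small illustration of that warning, the sketch below uses a stream that only implements read() and therefore inherits the base InputStream.available(), which returns 0. The names AvailableSketch, withDefaultAvailable, and countBytes are invented for this example. The stream still delivers all 867 bytes, so any code that sizes a buffer from available(), or loops while it is non-zero, would read nothing here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class AvailableSketch {

    /** Hypothetical stream that implements only read() and inherits the default available(). */
    static InputStream withDefaultAvailable(byte[] data) {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() {
                return source.read();
            }
        };
    }

    /** Counts the bytes actually delivered before read() reports end of stream. */
    static int countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[256];
        int total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = withDefaultAvailable(new byte[867]);
        System.out.println("available(): " + in.available()); // prints "available(): 0"
        System.out.println("bytes read: " + countBytes(in));  // prints "bytes read: 867"
    }
}
```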

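An example solution for your problem could look like this:

// Requires java.io.* and java.util.zip.*; caImport is the ZipEntry to read.
ZipFile aZipFile = new ZipFile(fileName);
InputStream zipInput = aZipFile.getInputStream( caImport );
ByteArrayOutputStream collected = new ByteArrayOutputStream();
byte[] buffer = new byte[ 8192 ];
int bytesRead;
// read() may return fewer bytes than requested, so loop until it reports end of stream (-1)
while ( ( bytesRead = zipInput.read( buffer ) ) != -1 )
{
    // keep only the bytes this call actually delivered
    collected.write( buffer, 0, bytesRead );
} // while not at end of stream
zipInput.close();
byte[] contentBytes = collected.toByteArray();
...

Because the loop ends only when read() returns -1 and does not depend on available(), it works for input of any size.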

夜吻♂芭芘 2024-12-18 13:52:50

A cleaner way to do this is as follows:

// IOUtils is presumably org.apache.commons.io.IOUtils (Apache Commons IO); the answer does not say.
public static byte[] readZipFileToByteArray(ZipFile zipFile, ZipEntry entry)
    throws IOException {
    InputStream in = null;
    try {
        in = zipFile.getInputStream(entry);
        return IOUtils.toByteArray(in);
    } finally {
        IOUtils.closeQuietly(in);
    }
}

where the IOUtils.toByteArray(in) method keeps reading until EOF and then returns the byte array.
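If pulling in an extra library is not an option, a rough equivalent can be sketched with the JDK alone, assuming Java 9 or later, where InputStream.readAllBytes() is available. The class name ZipEntryReader and the helper writeSampleZip are made up for this example; the helper only builds a throwaway zip so the sketch can be run on its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipEntryReader {

    /** Reads one zip entry completely using InputStream.readAllBytes() (Java 9+). */
    static byte[] readEntry(ZipFile zipFile, ZipEntry entry) throws IOException {
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = zipFile.getInputStream(entry)) {
            return in.readAllBytes();
        }
    }

    /** Hypothetical helper: writes a temporary zip with a single entry for demonstration. */
    static Path writeSampleZip(String entryName, byte[] payload) throws IOException {
        Path zip = Files.createTempFile("sample", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write(payload);
            out.closeEntry();
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[867];
        Path zip = writeSampleZip("data.bin", payload);
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            byte[] content = readEntry(zipFile, zipFile.getEntry("data.bin"));
            System.out.println(content.length); // prints 867
        } finally {
            Files.deleteIfExists(zip);
        }
    }
}
```

Either way, the point is the same as with IOUtils: let something that loops until EOF do the reading, rather than trusting a single read call.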
