Sending a large file as a stream to process.getOutputStream

Posted 2024-10-01 17:29:06


I am using the gzip utility on a Windows machine. I compressed a file and stored it in the DB as a blob. When I want to decompress this file using the gzip utility, I write the byte stream to process.getOutputStream. But after about 30 KB it is unable to read the file; it just hangs there.

I tried memory arguments and various read-and-flush logic. But if I write the same data to a file instead, it is pretty fast.

 OutputStream stdin = proc.getOutputStream();
 Blob blob = Hibernate.createBlob(inputFileReader);
 InputStream source = blob.getBinaryStream();
 byte[] buffer = new byte[256];
 long readBufferCount = 0;
 int bytesRead;
 while ((bytesRead = source.read(buffer)) > 0)
 {
  stdin.write(buffer, 0, bytesRead);  // write only the bytes actually read
  stdin.flush();
  readBufferCount += bytesRead;
  log.info("Reading the file - Read bytes: " + readBufferCount);
 }
 stdin.flush();

Regards,
Mani Kumar Adari.


Comments (1)

和我恋爱吧 2024-10-08 17:29:06


I suspect that the problem is that the external process (connected to proc) is either

  • not reading its standard input, or
  • it is writing stuff to its standard output that your Java application is not reading.

Bear in mind that Java talks to the external process using a pair of "pipes", and these have a limited amount of buffering. If you exceed the buffering capacity of a pipe, the writer process will be blocked writing to the pipe until the reader process has read enough data from the pipe to make space. If the reader doesn't read, then the pipeline locks up.

If you provided more context (e.g. the part of the application that launches the gzip process) I'd be able to be more definitive.

FOLLOWUP

gzip.exe is a UNIX utility we are using on Windows. gzip.exe works fine from the command prompt, but not with the Java program. Is there any way we can increase the buffer size Java writes to a pipe? I am concerned about the input part at present.

On UNIX, the gzip utility is typically used one of two ways:

  • gzip file compresses file turning it into file.gz.
  • ... | gzip | ... (or something similar) which writes a compressed version of its standard input to its standard output.

I suspect that you are doing the equivalent of the latter, with the Java application as both the source of the gzip command's input and the destination of its output. And this is precisely the scenario that can lock up ... if the Java application is not implemented correctly. For instance:

    Process proc = Runtime.getRuntime().exec(...);  // gzip.exe pathname.
    OutputStream out = proc.getOutputStream();
    while (...) {
        out.write(...);
    }
    out.flush();
    InputStream in = proc.getInputStream();
    while (...) {
        in.read(...);
    }

If the write phase of the application above writes too much data, it is guaranteed to lock up.
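That lockup is easy to observe directly. The sketch below (the class name, sizes, and timeout are illustrative, and it assumes a gzip executable is on the PATH) writes a few megabytes of incompressible data to gzip's stdin without ever reading its stdout, then uses a watchdog timeout to detect that the write never completes:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Random;

public class PipeDeadlockDemo {
    // Writes `total` random (hence incompressible) bytes to gzip's stdin
    // WITHOUT reading its stdout. Returns true if the writer thread was
    // still stuck inside write() after `timeoutMillis` -- i.e. both OS pipe
    // buffers filled up and the write blocked, exactly as described above.
    public static boolean writeBlocks(int total, long timeoutMillis) throws Exception {
        Process proc = new ProcessBuilder("gzip", "-c").start();  // assumes gzip on PATH
        Thread writer = new Thread(() -> {
            byte[] junk = new byte[total];
            new Random(1).nextBytes(junk);  // random data does not compress
            try {
                OutputStream stdin = proc.getOutputStream();
                stdin.write(junk);          // fills the stdin pipe, then blocks
                stdin.close();
            } catch (IOException expected) {
                // raised once the process is destroyed below, freeing the thread
            }
        });
        writer.start();
        writer.join(timeoutMillis);
        boolean blocked = writer.isAlive();  // still alive => write() is stuck
        proc.destroyForcibly();              // break the pipe so the writer can exit
        writer.join();
        return blocked;
    }
}
```

With a few megabytes of input the stdout pipe fills first, gzip's own write blocks, gzip stops reading stdin, the stdin pipe fills, and finally the Java `write()` blocks.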

Communication between the java application and gzip is via two pipes. As I stated above, a pipe will buffer a certain amount of data, but that amount is relatively small, and certainly bounded. This is the cause of the lockup. Here is what happens:

  1. The gzip process is created with a pair of pipes connecting it to the Java application process.
  2. The Java application writes data to its out stream.
  3. The gzip process reads that data from its standard input, compresses it, and writes it to its standard output.
  4. Steps 2. and 3. are repeated a few times, until finally the gzip process's attempt to write to its standard output blocks.

What has been happening is that gzip has been writing into its output pipe, but nothing has been reading from it. Eventually, we reach the point where we've exhausted the output pipe's buffer capacity, and the write to the pipe blocks.

Meanwhile, the Java application is still writing to the out stream, and after a couple more rounds this too blocks, because we've filled the other pipe.

The only solution is for the Java application to read and write at the same time. The simple way to do this is to create a second thread and do the writing to the external process from one thread and the reading from the process in the other one.
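The two-thread approach can be sketched as follows. This is a minimal round-trip helper, not your application's code: the class name and buffer size are illustrative, and it assumes a gzip executable is on the PATH (on Windows that would be gzip.exe):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class GzipPipe {
    // Compresses `data` by streaming it through an external gzip process.
    // A second thread feeds stdin while the calling thread drains stdout,
    // so neither OS pipe buffer can fill up and deadlock the exchange.
    public static byte[] compress(byte[] data) throws IOException, InterruptedException {
        Process proc = new ProcessBuilder("gzip", "-c").start();

        Thread writer = new Thread(() -> {
            try (OutputStream stdin = proc.getOutputStream()) {
                stdin.write(data);  // closing stdin signals EOF to gzip
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // Drain stdout here while the writer thread is still feeding stdin.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (InputStream stdout = proc.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = stdout.read(buf)) > 0) {
                compressed.write(buf, 0, n);
            }
        }
        writer.join();
        proc.waitFor();
        return compressed.toByteArray();
    }
}
```

The same shape works for decompression (`gzip -d -c`): the essential point is only that writing and reading happen concurrently, never sequentially.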

(Changing the Java buffering or the Java read / write sizes won't help. The buffering that matters is in the OS implementations of the pipes, and there's no way to change that from pure Java, if at all.)
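If the compressed output is headed for a file anyway, there is also a single-threaded way out: let ProcessBuilder redirect gzip's stdout straight to the file at the OS level, so the only pipe left is the stdin one that Java feeds. A sketch, again assuming gzip is on the PATH (the class and method names are illustrative):

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class GzipToFile {
    // Streams `source` through gzip, with gzip's stdout redirected by the OS
    // directly into `gzFile`. Since Java never holds the output pipe, writing
    // stdin from a single thread cannot deadlock.
    public static void compressToFile(InputStream source, File gzFile)
            throws IOException, InterruptedException {
        Process proc = new ProcessBuilder("gzip", "-c")
                .redirectOutput(gzFile)   // OS delivers stdout to the file
                .start();
        try (OutputStream stdin = proc.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = source.read(buf)) > 0) {
                stdin.write(buf, 0, n);
            }
        }  // closing stdin lets gzip finish and exit
        proc.waitFor();
    }
}
```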
