Continuously reading an external process's STDOUT in Ruby

Posted on 2024-07-27 22:35:31

I want to run blender from the command line through a ruby script, which will then process the output given by blender line by line to update a progress bar in a GUI. It's not really important that blender is the external process whose stdout I need to read.

I can't seem to be able to catch the progress messages blender normally prints to the shell when the blender process is still running, and I've tried a few ways. I always seem to access the stdout of blender after blender has quit, not while it's still running.

Here's an example of a failed attempt. It does get and print the first 25 lines of the output of blender, but only after the blender process has exited:

blender = nil
t = Thread.new do
  blender = open "| blender -b mball.blend -o //renders/ -F JPEG -x 1 -f 1"
end
puts "Blender is doing its job now..."
25.times { puts blender.gets}

Edit:

To make it a little clearer, the command invoking blender gives back a stream of output in the shell, indicating progress (part 1-16 completed etc). It seems that any call to "gets" on the output blocks until blender quits. The issue is how to get access to this output while blender is still running, as blender prints its output to the shell.

Comments (6)

预谋 2024-08-03 22:35:31

I've had some success in solving this problem of mine. Here are the details, with some explanations, in case anyone having a similar problem finds this page. But if you don't care for details, here's the short answer:

Use PTY.spawn in the following manner (with your own command of course):

require 'pty'
cmd = "blender -b mball.blend -o //renders/ -F JPEG -x 1 -f 1" 
begin
  PTY.spawn( cmd ) do |stdout, stdin, pid|
    begin
      # Do stuff with the output here. Just printing to show it works
      stdout.each { |line| print line }
    rescue Errno::EIO
      puts "Errno::EIO error, but this probably just means " +
            "that the process has finished giving output"
    end
  end
rescue PTY::ChildExited
  puts "The child process exited!"
end

And here's the long answer, with way too many details:

The real issue seems to be that if a process doesn't explicitly flush its stdout, then anything written to stdout is buffered rather than actually sent, until the buffer fills up or the process is done, so as to minimize IO (this is apparently an implementation detail of many C libraries, which switch to block buffering when stdout is not a terminal so that throughput is maximized through less frequent IO). If you can easily modify the process so that it flushes stdout regularly, then that would be your solution. In my case, it was blender, so a bit intimidating for a complete noob such as myself to modify the source.

But when you run these processes from the shell, they display stdout to the shell in real-time, and the stdout doesn't seem to be buffered. I believe it's only block-buffered when stdout is a pipe or a file, as happens when another process captures it; when stdout is a terminal, the output is seen in real time.

This behavior can even be observed with a ruby process as the child process whose output must be collected in real time. Just create a script, random.rb, with the following line:

5.times { |i| sleep( 3*rand ); puts "#{i}" }

Then a ruby script to call it and return its output:

IO.popen( "ruby random.rb") do |random|
  random.each { |line| puts line }
end

You'll see that you don't get the result in real-time as you might expect, but all at once at the end. STDOUT is being buffered, even though it isn't buffered when you run random.rb yourself from a terminal. This can be solved by adding a STDOUT.flush statement inside the block in random.rb. But if you can't change the source, you have to work around this. You can't flush it from outside the process.

If the subprocess can print to the shell in real-time, then there must be a way to capture this with Ruby in real-time as well. And there is. You have to use the PTY module, which ships with Ruby's standard library (it was there in 1.8.6, anyway). The sad thing is that it wasn't documented at the time, but fortunately I found some examples of its use.

First, to explain what PTY is: it stands for pseudo terminal. Basically, it allows the ruby script to present itself to the subprocess as a terminal, as if a real user had just typed the command into a shell. So any altered behavior that occurs only when a user has started the process through a shell (such as the STDOUT not being block-buffered, in this case) will occur. Concealing the fact that another process has started this process allows you to collect the STDOUT in real-time, as it isn't being held back in a buffer.
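
To see the difference a pseudo terminal makes, here is a minimal sketch (not part of the original answer) in which a child process simply reports whether its own stdout is a terminal. The constant CHILD_TTY_CHECK and the helper names tty_via_pipe and tty_via_pty are made up for illustration, and the sketch assumes a Unix-like system where the pty library is available.

```ruby
require 'pty'
require 'rbconfig'

# Hypothetical child command: prints whether its own stdout is a terminal.
CHILD_TTY_CHECK = [RbConfig.ruby, '-e', 'puts $stdout.tty?']

# Run the child with its stdout connected to an ordinary pipe.
def tty_via_pipe
  IO.popen(CHILD_TTY_CHECK, &:read).strip
end

# Run the child with its stdout connected to a pseudo terminal.
def tty_via_pty
  output = +''
  PTY.spawn(*CHILD_TTY_CHECK) do |reader, _writer, _pid|
    begin
      reader.each_line { |line| output << line }
    rescue Errno::EIO
      # On Linux, reading from a PTY after the child has exited raises EIO;
      # here it is treated as end of output.
    end
  end
  output.strip # the PTY may deliver "\r\n" line endings, so strip both
end

puts "via pipe: stdout is a tty? #{tty_via_pipe}"
puts "via pty:  stdout is a tty? #{tty_via_pty}"
```

Programs and C libraries that choose their buffering mode based on this check are the ones for which the PTY approach should help.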

To make this work with the random.rb script as the child, try the following code:

require 'pty'
begin
  PTY.spawn( "ruby random.rb" ) do |stdout, stdin, pid|
    begin
      stdout.each { |line| print line }
    rescue Errno::EIO
    end
  end
rescue PTY::ChildExited
  puts "The child process exited!"
end

一绘本一梦想 2024-08-03 22:35:31

Use IO.popen. This is a good example.

Your code would become something like:

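t = Thread.new do
  IO.popen("blender -b mball.blend -o //renders/ -F JPEG -x 1 -f 1") do |blender|
    # Print each line as soon as the child process delivers it
    blender.each do |line|
      puts line
    end
  end
end
# If nothing else (such as a GUI main loop) keeps the script alive,
# wait for the reader thread so the script doesn't exit first
t.join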
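
Since the question is about updating a GUI, one possible way to extend this is to have the reader thread hand lines over through a Queue, so the GUI side never blocks on the pipe. The following is only a sketch under assumptions: CHILD is a hypothetical stand-in command (a Ruby one-liner that flushes after every line, which blender itself may not do), and start_reader and the :done sentinel are names invented here.

```ruby
require 'rbconfig'

# Hypothetical stand-in for the long-running command. It sets $stdout.sync,
# so lines are delivered as they are printed.
CHILD = [RbConfig.ruby, '-e',
         '$stdout.sync = true; 3.times { |i| puts "part #{i}"; sleep 0.1 }']

# Start the command and push each output line onto a queue from a
# background thread, so the caller is never blocked on the pipe.
def start_reader(cmd, queue)
  Thread.new do
    IO.popen(cmd) do |io|
      io.each_line { |line| queue << line.chomp }
    end
    queue << :done # sentinel: the child has closed its stdout
  end
end

lines = Queue.new
reader = start_reader(CHILD, lines)

received = []
while (item = lines.pop) != :done
  received << item # a GUI would update its progress bar here
  puts "got: #{item}"
end
reader.join
```

A real GUI would more likely poll the queue from a timer callback with a non-blocking pop rather than loop on it, but the hand-off between the threads stays the same.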

逆光飞翔i 2024-08-03 22:35:31

STDOUT.flush or
STDOUT.sync = true
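
This only helps when you control the child's source, as with a Ruby child script. As a rough illustration, the sketch below runs the same hypothetical child twice through IO.popen, once with $stdout.sync set and once without, and records when each line reaches the parent. The helper names child_source and timed_lines are invented for this example.

```ruby
require 'rbconfig'

# Hypothetical child script source; `sync` controls whether it sets $stdout.sync.
def child_source(sync)
  "$stdout.sync = #{sync}; 3.times { |i| puts i; sleep 0.3 }"
end

# Returns [line, seconds_since_start] pairs, timestamped as the parent
# receives each line from the pipe.
def timed_lines(sync)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  IO.popen([RbConfig.ruby, '-e', child_source(sync)]) do |io|
    io.each_line.map do |line|
      [line.chomp, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
    end
  end
end

[true, false].each do |sync|
  arrivals = timed_lines(sync).map { |line, t| format('%s@%.1fs', line, t) }
  puts "sync=#{sync}: #{arrivals.join(' ')}"
end
```

With sync enabled the arrival times should be spread out over the child's run; without it the lines should arrive together when the child exits.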

甜警司 2024-08-03 22:35:31

Blender probably doesn't print line-breaks until it is ending the program. Instead, it is printing the carriage return character (\r). The easiest solution is probably searching for the magic option which prints line-breaks with the progress indicator.

The problem is that IO#gets (and various other IO methods) use the line break as a delimiter. They will read the stream until they hit the "\n" character (which blender isn't sending).

Try setting the input separator $/ = "\r" or using blender.gets("\r") instead.

BTW, for problems such as these, you should always check puts someobj.inspect or p someobj (both of which do the same thing) to see any hidden characters within the string.
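
A small sketch of both suggestions, using a StringIO as a stand-in for the pipe (it accepts the same separator argument to gets). The sample text in raw is made up and is not actual blender output.

```ruby
require 'stringio'

# Hypothetical progress output that redraws a single line with carriage returns.
raw = "part 1\rpart 2\rpart 3\rdone\n"
io = StringIO.new(raw)

updates = []
# Passing "\r" as the separator makes every progress update its own record.
while (chunk = io.gets("\r"))
  p chunk                 # p shows hidden characters such as "\r" and "\n"
  updates << chunk.chomp  # chomp strips a trailing "\r", "\n" or "\r\n"
end

puts updates.inspect
```

With a real pipe the same loop would be written against the IO object returned by IO.popen or PTY.spawn.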

痕至 2024-08-03 22:35:31

I don't know whether Open3::pipeline_rw() was available yet at the time ehsanul answered the question, but it really makes things simpler.

I don't understand ehsanul's job with Blender, so I made another example with tar and xz. tar writes the input file(s) to its stdout stream, then xz takes that stdout and compresses it, again to another stdout. Our job is to take the last stdout and write it to our final file:

require 'open3'

if __FILE__ == $0
    cmd_tar = ['tar', '-cf', '-', '-T', '-']
    cmd_xz = ['xz', '-z', '-9e']
    list_of_files = [...]

    Open3.pipeline_rw(cmd_tar, cmd_xz) do |first_stdin, last_stdout, wait_threads|
        list_of_files.each { |f| first_stdin.puts f }
        first_stdin.close

        # Now start writing to target file
        open(target_file, 'wb') do |target_file_io|
            while (data = last_stdout.read(1024)) do
                target_file_io.write data
            end
        end # open
    end # pipeline_rw
end
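
For a version that runs without tar, xz, or a file list, here is a minimal sketch of the same pattern with two Ruby one-liners standing in for the two commands. The stage names upcase and number and the input words are invented for illustration.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical two-stage pipeline standing in for tar and xz:
# stage one upcases its input, stage two numbers the lines.
upcase = [RbConfig.ruby, '-e', 'STDIN.each_line { |l| puts l.upcase }']
number = [RbConfig.ruby, '-e',
          'STDIN.each_line.with_index(1) { |l, i| puts "#{i}: #{l}" }']

result = nil
Open3.pipeline_rw(upcase, number) do |first_stdin, last_stdout, wait_threads|
  %w[alpha beta].each { |word| first_stdin.puts word }
  first_stdin.close           # signal EOF so the pipeline can finish
  result = last_stdout.read   # read until the last stage closes its stdout
  wait_threads.each(&:join)   # wait for both child processes to exit
end

puts result
```

Closing first_stdin before reading matters here: otherwise the first stage would keep waiting for input and the read on last_stdout would not return.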

情归归情 2024-08-03 22:35:31

Old question, but had similar problems.

Without really changing my Ruby code, one thing that helped was wrapping my pipe with stdbuf, like so:

cmd = "stdbuf -oL -eL -i0  openssl s_client -connect #{xAPI_ADDRESS}:#{xAPI_PORT}"

@xSess = IO.popen(cmd.split " ", mode = "w+")  

In my example, the actual command I want to interact with, as if it were a shell, is openssl.

-oL -eL tells it to buffer STDOUT and STDERR only up to a newline (line buffering). Replace L with 0 to unbuffer completely.

This doesn't always work, though: sometimes the target process enforces its own stream buffer type, like another answer pointed out.
