Recording sound to .wav using a PortAudio wrapper in Ruby

Posted 2025-01-02 13:05:57


I've been playing around with Ruby recently, and I decided to start a simple project: a Ruby script that records line-in sound to a .wav file. I discovered that Ruby doesn't provide very good access to hardware devices (and it probably shouldn't), but that PortAudio does, and I found a great wrapper for PA, ruby-portaudio (it is not a gem, I think because it uses Ruby's ffi to attach to PortAudio, and the PA library could be in a variety of places). I've been muddling through PortAudio's documentation and examples to figure out how PA works. I haven't written or read C in years.

I'm having difficulty working out which parameters I should pass when creating a stream and when creating a buffer. For example, what exactly is a frame, and how is it related to other parameters like channels and sample rate? I'm also totally new to audio programming in general, so if anyone could point me to some general tutorials on device-level audio, I'd appreciate it.

ruby-portaudio provides a single example that creates a stream and a buffer, writes a sine wave to the buffer, then sends the buffer to the stream to be played. I'm having trouble with some of the Ruby in the example, specifically the loop block.

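  # Initialize the PortAudio library before opening any stream.
  PortAudio.init

  block_size = 1024      # frames per buffer
  sr   = 44100           # sample rate in Hz
  step = 1.0/sr          # seconds that elapse per frame
  time = 0.0             # running time in seconds

  # Open a mono float32 output stream on the default output device.
  stream = PortAudio::Stream.open(
             :sample_rate => sr,
             :frames => block_size,
             :output => {
               :device => PortAudio::Device.default_output,
               :channels => 1,
               :sample_format => :float32
              })

  # Buffer with the same format, channel count and size as the stream.
  buffer = PortAudio::SampleBuffer.new(
             :format   => :float32,
             :channels => 1,
             :frames   => block_size)

  # Ctrl-C clears the flag so the loop below can exit cleanly.
  playing = true
  Signal.trap('INT') { playing = false }
  puts "Ctrl-C to exit"

  stream.start

  loop do
    # The block is called with each frame and channel index; its value
    # becomes that sample. The filled buffer is then written to the stream.
    stream << buffer.fill { |frame, channel|
      time += step
      # 440 Hz tone, amplitude-modulated by a 1 Hz cosine.
      Math.cos(time * 2 * Math::PI * 440.0) * Math.cos(time * 2 * Math::PI)
    }

    break unless playing
  end

  stream.stop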

If I'm going to record, I should be reading from a stream into a buffer, then manipulating that buffer and writing it to a file, right?

Also, if I'm barking up the wrong tree here and there is an easier way to do this (in Ruby), some direction would be nice.


吹梦到西洲 2025-01-09 13:05:57

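Let's first clarify the terms you were asking about. To do that, I'll try to explain the audio pipeline in a simplified way. When you generate a sound as in your example, your sound card periodically requests a block (= buffer) from your code, which you fill with samples. In PortAudio's terms, a frame is one sample for each channel at a single instant, so with one channel a frame is simply one sample, and the :frames parameter is the number of frames per block. The sample rate defines how many samples per channel you provide within one second, and thus the speed at which your samples are played back. The block size (= buffer size) determines how many samples the sound card requests in one go. A buffer is usually kept quite small, because the buffer size directly affects latency (big buffer => high latency) and large arrays can be slow (Ruby arrays in particular are slow).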
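To make the relationship between these parameters concrete, here is a small sketch of the arithmetic, using the numbers from the question's example (44100 Hz, 1024 frames, one float32 channel). The constant and method names are hypothetical and not part of ruby-portaudio, and the sketch assumes that a frame means one sample per channel, as in PortAudio's own documentation.

```ruby
# Values taken from the question's example; all names here are illustrative.
SAMPLE_RATE      = 44_100 # frames per second
BLOCK_SIZE       = 1024   # frames per buffer
CHANNELS         = 1
BYTES_PER_SAMPLE = 4      # a float32 sample occupies 4 bytes

# Total number of individual samples held by one buffer.
def samples_per_buffer(frames, channels)
  frames * channels
end

# Memory occupied by one buffer, in bytes.
def buffer_bytes(frames, channels, bytes_per_sample)
  frames * channels * bytes_per_sample
end

# Length of audio covered by one buffer, in milliseconds.
def buffer_duration_ms(frames, sample_rate)
  1000.0 * frames / sample_rate
end

puts samples_per_buffer(BLOCK_SIZE, CHANNELS)
puts buffer_bytes(BLOCK_SIZE, CHANNELS, BYTES_PER_SAMPLE)
puts buffer_duration_ms(BLOCK_SIZE, SAMPLE_RATE).round(2)
```

At these settings one buffer covers roughly 23 ms of audio, which is why doubling the block size also doubles the delay contributed by each buffer.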

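When you're recording sound from your sound card, something similar happens. Your function gets called every now and then, and the samples from the microphone are typically passed in as an argument to the function (or even just as a reference to such a buffer). You are then expected to process these samples, e.g. by writing them to disk.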
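As a rough sketch of the "write them to disk" step: assuming the recorded samples end up in a plain Ruby array of floats between -1.0 and 1.0 (how to read them out of a ruby-portaudio input stream is not covered here), a 16-bit PCM .wav file can be written with nothing more than `Array#pack`. The method name `write_wav`, the constants, and the generated test tone that stands in for recorded input are all made up for illustration.

```ruby
require 'tmpdir'

# Write interleaved float samples (-1.0..1.0) as a 16-bit PCM WAV file.
def write_wav(path, samples, sample_rate: 44_100, channels: 1)
  bits_per_sample = 16
  block_align = channels * bits_per_sample / 8 # bytes per frame
  byte_rate   = sample_rate * block_align      # bytes per second

  # Clamp each float, scale it to a signed 16-bit integer, pack little-endian.
  pcm = samples.map { |s| (s.clamp(-1.0, 1.0) * 32_767).round }.pack('s<*')

  File.open(path, 'wb') do |f|
    f.write('RIFF')
    f.write([36 + pcm.bytesize].pack('V')) # size of everything after this field
    f.write('WAVE')
    f.write('fmt ')
    # fmt chunk: chunk size 16, format 1 (PCM), then the stream parameters.
    f.write([16, 1, channels, sample_rate, byte_rate,
             block_align, bits_per_sample].pack('VvvVVvv'))
    f.write('data')
    f.write([pcm.bytesize].pack('V'))
    f.write(pcm)
  end
end

# Stand-in for recorded input: one second of a 440 Hz sine at half amplitude.
DEMO_RATE = 44_100
DEMO_PATH = File.join(Dir.tmpdir, 'tone.wav')
tone = Array.new(DEMO_RATE) { |i| 0.5 * Math.sin(2 * Math::PI * 440.0 * i / DEMO_RATE) }
write_wav(DEMO_PATH, tone, sample_rate: DEMO_RATE)
```

In a real recorder the array passed to `write_wav` would be built up from the buffers read from the input stream. For long recordings it would make more sense to append each buffer to the file as it arrives and patch the two size fields at the end.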

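I know the thought of "doing everything in Ruby" is quite tempting, because it is such a beautiful language. When you're planning to do audio processing in real time, though, I would recommend switching to a compiled language (C, C++, Obj-C, ...). These handle audio much better because they are closer to the hardware than Ruby and thus generally faster, which can be quite an issue in audio processing. This is probably also the reason why there are so few Ruby audio libraries around, so maybe Ruby just isn't the right tool for the job.

By the way, I tried out ruby-portaudio, ffi-portaudio and ruby-audio, and none of them worked properly on my MacBook (I was trying to generate a sine wave), which sadly shows once again that Ruby is not capable of handling this stuff (yet?).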
