Recording sound to a .wav file in Ruby using a PortAudio wrapper

Posted on 2025-01-02 13:05:57

I've been playing around with Ruby recently, and I decided to start a simple project: a Ruby script that records line-in sound to a .wav file. I discovered that Ruby doesn't provide very good access to hardware devices (and it probably shouldn't), but that PortAudio does, and I found a great wrapper for PA, ruby-portaudio (it is not a gem, I think because it uses Ruby's ffi to attach to PortAudio, and the PA library could be in a variety of places). I've been muddling through PortAudio's documentation and examples to figure out how PA works. I haven't written or read C in years.

I'm having difficulty working out which parameters I should pass when creating a stream and when creating a buffer. For example, what exactly is a frame, and how is it related to other parameters like channel count and sample rate? I'm also totally new to audio programming in general, so if anyone could point me to some general tutorials about device-level audio, I'd appreciate it.

ruby-portaudio provides a single example that creates a stream and a buffer, writes a sine wave to the buffer, then sends the buffer to the stream to be played. I'm having trouble with some of the Ruby in the example, specifically the loop block.

  PortAudio.init

  block_size = 1024
  sr   = 44100
  step = 1.0/sr
  time = 0.0

  stream = PortAudio::Stream.open(
             :sample_rate => sr,
             :frames => block_size,
             :output => {
               :device => PortAudio::Device.default_output,
               :channels => 1,
               :sample_format => :float32
              })

  buffer = PortAudio::SampleBuffer.new(
             :format   => :float32,
             :channels => 1,
             :frames   => block_size)

  playing = true
  Signal.trap('INT') { playing = false }
  puts "Ctrl-C to exit"

  stream.start

  loop do
    stream << buffer.fill { |frame, channel|
      time += step
      Math.cos(time * 2 * Math::PI * 440.0) * Math.cos(time * 2 * Math::PI)
    }

    break unless playing
  end

  stream.stop

If I'm going to be recording, I should be reading a stream into a buffer, then manipulating that buffer and writing it to file, right?

Also, if I'm barking up the wrong tree here and there is an easier way to do this in Ruby, some direction would be nice.

Answer by 吹梦到西洲, 2025-01-09 13:05:57

Let's first clarify the terms you were asking about. For this purpose I will try to explain the audio pipeline in a simplified way. When you are generating a sound as in your example, your sound card periodically requests a block of audio (= a buffer) from your code, which you fill with your samples. Strictly speaking, a frame in PortAudio is one sample for each channel at a single instant, so the :frames option in your example is presumably the number of frames per buffer. The sampling rate defines how many frames you provide per second and thus the speed at which your samples are played back. The buffer size (= block size) determines how many frames you provide in one request from the sound card. A buffer is typically quite small, because the buffer size directly affects latency (large buffer => high latency) and large arrays can be slow (Ruby arrays especially).
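To make the relationship between these numbers concrete, here is a small arithmetic sketch in plain Ruby. It assumes PortAudio's convention that one frame holds one sample per channel, and it reuses the values from the question (44100 Hz, 1024 frames, float32 samples). The helper names are made up for illustration and are not part of ruby-portaudio.

```ruby
# Hypothetical helpers, not part of any library: they only do arithmetic.

# One frame holds one sample per channel, so a buffer of N frames
# contains N * channels individual samples.
def samples_per_buffer(frames, channels)
  frames * channels
end

# How much audio one buffer represents, in milliseconds. This is also
# the minimum latency that a single buffer adds.
def buffer_duration_ms(frames, sample_rate)
  frames * 1000.0 / sample_rate
end

# Memory taken by one buffer; a float32 sample is 4 bytes.
def buffer_bytes(frames, channels, bytes_per_sample)
  frames * channels * bytes_per_sample
end

sample_rate = 44_100 # frames per second
block_size  = 1024   # frames per buffer

puts "mono samples per buffer:   #{samples_per_buffer(block_size, 1)}"
puts "stereo samples per buffer: #{samples_per_buffer(block_size, 2)}"
puts format('buffer duration: %.2f ms', buffer_duration_ms(block_size, sample_rate))
puts "mono float32 buffer size:  #{buffer_bytes(block_size, 1, 4)} bytes"
```

With these values a buffer lasts a little over 23 ms, which is why doubling the block size roughly doubles the latency.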

Similar things happen when you are recording sound from your sound card. Your function gets called every now and then, and the samples from the microphone are typically passed in as an argument to the function (or even just a reference to such a buffer). You are then expected to process these samples, e.g. by writing them to disk.
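The "writing them to disk" step does not depend on PortAudio at all, so it can be sketched on its own. The following is a minimal sketch, assuming the recorded samples arrive as Ruby floats in the range -1.0..1.0 and that 16-bit PCM is an acceptable output format. The method name `write_wav` and the output file name are made up for illustration, and a generated tone stands in for real microphone data.

```ruby
require 'tmpdir'

# Hypothetical helper: writes float samples (-1.0..1.0) as a 16-bit PCM WAV.
# For more than one channel the samples must already be interleaved
# (L, R, L, R, ...).
def write_wav(path, samples, sample_rate: 44_100, channels: 1)
  # Clamp each float and scale it to a signed 16-bit integer.
  pcm  = samples.map { |s| (s.clamp(-1.0, 1.0) * 32_767).round }
  data = pcm.pack('s<*') # little-endian signed 16-bit

  bits        = 16
  block_align = channels * bits / 8       # bytes per frame
  byte_rate   = sample_rate * block_align # bytes per second

  # Canonical 44-byte header: RIFF chunk, "fmt " chunk (PCM = 1), data chunk.
  header = ['RIFF', 36 + data.bytesize, 'WAVE',
            'fmt ', 16, 1, channels, sample_rate, byte_rate, block_align, bits,
            'data', data.bytesize].pack('a4Va4a4VvvVVvva4V')

  File.binwrite(path, header + data)
end

# Stand-in for recorded audio: one second of a 440 Hz tone at half volume.
tone = Array.new(44_100) { |n| Math.sin(2 * Math::PI * 440.0 * n / 44_100) * 0.5 }
path = File.join(Dir.tmpdir, 'sketch_tone.wav')
write_wav(path, tone)
puts "wrote #{File.size(path)} bytes to #{path}"
```

In a real recorder you would collect the samples from each incoming buffer and either append them to an array or stream them to the file, patching the two size fields in the header once recording stops.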

I know that the thought of "doing everything in Ruby" is quite tempting, because it is such a beautiful language. If you are planning on doing audio processing in real time, though, I would recommend switching to a compiled language (C, C++, Obj-C, ...). These can handle audio much better, because they're much closer to the hardware than Ruby and thus generally faster, which can be quite an issue in audio processing. This is probably also the reason why there are so few Ruby audio libraries around, so maybe Ruby just isn't the right tool for the job.
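One rough way to check whether speed matters for your case is to compare how long Ruby takes to produce one block against the time that block lasts. The sketch below assumes the block size and sample rate from the question and uses a made-up `fill_block` method that computes the same tone as the example. It measures only the arithmetic, not FFI overhead or garbage-collection pauses, so treat the result as a lower bound.

```ruby
require 'benchmark'

sample_rate = 44_100
block_size  = 1024

# Time available to produce one block before the device needs the next one.
deadline_s = block_size.to_f / sample_rate

# Hypothetical helper: fills one block with the tone from the question's
# example (a 440 Hz cosine modulated by a 1 Hz cosine).
def fill_block(block_size, sample_rate, start_time)
  step = 1.0 / sample_rate
  Array.new(block_size) do |i|
    t = start_time + i * step
    Math.cos(t * 2 * Math::PI * 440.0) * Math.cos(t * 2 * Math::PI)
  end
end

runs = 200
elapsed = Benchmark.realtime do
  runs.times { |n| fill_block(block_size, sample_rate, n * deadline_s) }
end
per_block_s = elapsed / runs

puts format('deadline per block: %.2f ms', deadline_s * 1000)
puts format('measured fill time: %.3f ms', per_block_s * 1000)
puts format('share of budget used: %.1f%%', 100 * per_block_s / deadline_s)
```

If the measured share stays small, simple recording or tone generation may be feasible in Ruby; heavier per-sample processing is where the budget tends to run out.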

By the way, I tried out ruby-portaudio, ffi-portaudio and ruby-audio, and none of them worked properly on my MacBook (I tried to generate a sine wave), which sadly suggests again that Ruby is not up to handling this stuff (yet?).
