Recording, modifying and playing back audio on iOS

Posted 2024-10-19 15:27:34

EDIT: In the end I did exactly what I describe below: AVAudioRecorder to record the speech, and OpenAL for the pitch shift and playback. It worked out quite well.

I have a question about recording, modifying and playing back audio. I asked a similar question before (Record, modify pitch and play back audio in real time on iOS), but I now have more information and would appreciate some further advice.

So firstly, this is what I am trying to do (on a thread separate from the main thread):

  1. Monitor the iPhone mic.
  2. Check for sound louder than a certain volume.
  3. If the level is above the threshold, start recording (e.g. a person starts talking).
  4. Continue recording until the volume drops below the threshold (e.g. the person stops talking).
  5. Modify the pitch of the recorded sound.
  6. Play back the sound.

I was thinking of using AVAudioRecorder to monitor and record the sound (there is a good tutorial here: http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/), and OpenAL to modify the pitch of the recorded audio.

So my question is: is my thinking in the list above correct, am I missing something, or is there a better/easier way to do it? Can I avoid mixing audio libraries and use just AVFoundation to change the pitch as well?

Comments (2)

合约呢 2024-10-26 15:27:34

You can either use AVAudioRecorder or something lower-level, such as the real-time I/O audio unit.

The concept of 'volume' is pretty vague. You might want to look at the difference between peak and RMS values, and at how to integrate an RMS value over a given time (say 300 ms, which is what a VU meter uses).

Basically you sum the squares of all the sample values and divide by the number of samples to get the mean square. To get dBFS you would take the square root and convert with 20 * log10f(sqrt(sum/num_samples)), but you can do it in one step without the sqrt as 10 * log10f(sum/num_samples).
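To make the peak versus RMS difference concrete, here is a minimal sketch of both measurements over a single block. The function names peakDbfs and rmsDbfs are made up for this illustration, and it assumes float samples where full scale is 1.0. Silence comes out as negative infinity, which a real meter would probably clamp.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Peak level of one block in dBFS: 20 * log10(max |x|).
// Assumes full scale is 1.0; an empty or silent block yields -infinity.
float peakDbfs(const float *audio, std::size_t samples)
{
    float peak = 0.0f;
    for (std::size_t i = 0; i < samples; ++i)
    {
        const float magnitude = std::fabs(audio[i]);
        if (magnitude > peak)
            peak = magnitude;
    }
    return 20.0f * std::log10(peak);
}

// RMS level of one block in dBFS.
// 20 * log10(sqrt(meanSquare)) equals 10 * log10(meanSquare),
// so the square root can be skipped.
float rmsDbfs(const float *audio, std::size_t samples)
{
    if (samples == 0)
        return -std::numeric_limits<float>::infinity();

    double sum = 0.0;
    for (std::size_t i = 0; i < samples; ++i)
        sum += static_cast<double>(audio[i]) * audio[i];

    return static_cast<float>(10.0 * std::log10(sum / samples));
}
```

A block containing a single full-scale click and three zero samples reads 0 dBFS peak but only about -6 dBFS RMS, which is why the two measures behave so differently as a recording trigger.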

You'll need to do a lot of adjusting of integration times and thresholds to get it to behave the way you want.
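The start/stop logic from the question could be sketched as a small gate fed with level readings in dB. Everything here is hypothetical: the class name LevelGate, the separate open and close thresholds, and the hold count are my own choices, and they are exactly the kind of values you would end up tuning.

```cpp
#include <cstddef>

// Hypothetical helper: turns a stream of level readings (in dB) into
// "start recording" and "stop recording" events.
class LevelGate
{
public:
    enum Event { None, Started, Stopped };

    // openDb:       level that must be exceeded to start recording
    // closeDb:      level the signal must fall below to count as quiet
    //               (set it lower than openDb to avoid chattering)
    // holdReadings: consecutive quiet readings required before stopping
    LevelGate(float openDb, float closeDb, std::size_t holdReadings)
    : _openDb(openDb)
    , _closeDb(closeDb)
    , _hold(holdReadings)
    , _quiet(0)
    , _recording(false)
    {
    }

    // feed one level reading, get back what (if anything) changed
    Event update(float levelDb)
    {
        if (!_recording)
        {
            if (levelDb > _openDb)
            {
                _recording = true;
                _quiet = 0;
                return Started;
            }
            return None;
        }

        if (levelDb < _closeDb)
        {
            // quiet reading: stop once enough of them arrive in a row
            if (++_quiet >= _hold)
            {
                _recording = false;
                _quiet = 0;
                return Stopped;
            }
        }
        else
        {
            // still loud enough, so restart the quiet count
            _quiet = 0;
        }
        return None;
    }

    bool isRecording() const { return _recording; }

private:
    float _openDb;
    float _closeDb;
    std::size_t _hold;
    std::size_t _quiet;
    bool _recording;
};
```

Using two thresholds plus a hold count keeps short pauses between words from ending the recording. With LevelGate gate(-30.0f, -40.0f, 2), a reading of -20 starts recording and two consecutive readings below -40 stop it.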

For pitch shifting, I think OpenAL will do the trick. The technique behind it is called band-limited interpolation: https://ccrma.stanford.edu/~jos/resample/Theory_Ideal_Bandlimited_Interpolation.html

This example shows an RMS calculation as a running average. The circular buffer keeps a history of squares, which removes the need to re-sum all the squares on every call. I haven't run it, so treat it as pseudocode ;)

Example:

#include <cmath>   // log10f

class VUMeter
{

protected:

    // samples per second
    float _sampleRate;

    // the integration time in seconds (vu meter is 300ms)
    float _integrationTime;

    // these maintain a circular buffer which contains
    // the 'squares' of the audio samples

    int _integrationBufferLength;
    float *_integrationBuffer;
    float *_integrationBufferEnd;
    float *_cursor;

    // this is a sort of accumulator to make a running
    // average more efficient

    float _sum;

public:

    VUMeter()
    : _sampleRate(48000.0f)
    , _integrationTime(0.3f)
    , _sum(0.)
    {
        // create a buffer of values to be integrated
        // e.g 300ms @ 48khz is 14400 samples

        _integrationBufferLength = (int) (_integrationTime * _sampleRate);

        // the trailing () value-initialises every slot to zero
        _integrationBuffer = new float[_integrationBufferLength]();

        // set the pointers for our circular buffer

        _integrationBufferEnd = _integrationBuffer + _integrationBufferLength;
        _cursor = _integrationBuffer;

    }

    ~VUMeter()
    {
        // allocated with new[], so release with delete[]
        delete[] _integrationBuffer;
    }

    float getRms(float *audio, int samples)
    {
        // process the samples
        // this part accumulates the 'squares'

        for (int i = 0; i < samples; ++i)
        {
            // get the input sample

            float s = audio[i];

            // remove the oldest value from the sum

            _sum -= *_cursor;

            // calculate the square and write it into the buffer

            double square = s * s;
            *_cursor = square;

            // add it to the sum

            _sum += square;

            // increment the buffer cursor and wrap

            ++_cursor;

            if (_cursor == _integrationBufferEnd)
                _cursor = _integrationBuffer;
        }

        // now calculate the level in dB: 10 * log10(mean square)
        // equals 20 * log10(rms), so no sqrt is needed

        return 10 * log10f(_sum / _integrationBufferLength);
    }
};

谈场末日恋爱 2024-10-26 15:27:34

OpenAL resampling changes pitch and duration inversely: a sound resampled to a higher pitch plays for a shorter time, and therefore faster.
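A rough sketch of why that happens, using plain linear interpolation rather than whatever OpenAL does internally (that substitution, the function name resampleLinear and the ratio parameter are my assumptions). Reading the input ratio times faster raises the pitch by that factor and leaves only 1/ratio as many output samples.

```cpp
#include <cstddef>
#include <vector>

// Resample by stepping through the input 'ratio' samples at a time.
// ratio > 1 raises the pitch and shortens the sound,
// ratio < 1 lowers the pitch and lengthens it.
// Linear interpolation is used for brevity; it is a cruder stand-in
// for band-limited interpolation.
std::vector<float> resampleLinear(const std::vector<float> &in, double ratio)
{
    std::vector<float> out;
    if (in.empty() || ratio <= 0.0)
        return out;

    // output length shrinks or grows by 1 / ratio
    const std::size_t outLength = static_cast<std::size_t>(in.size() / ratio);
    out.reserve(outLength);

    for (std::size_t n = 0; n < outLength; ++n)
    {
        // fractional read position in the input
        const double position = n * ratio;
        const std::size_t index = static_cast<std::size_t>(position);
        if (index >= in.size())
            break;

        const double fraction = position - index;

        // blend the two neighbouring samples (repeat the last one at the end)
        const float a = in[index];
        const float b = (index + 1 < in.size()) ? in[index + 1] : a;
        out.push_back(static_cast<float>(a + (b - a) * fraction));
    }
    return out;
}
```

With 8 input samples, a ratio of 2.0 gives 4 output samples (an octave up, half as long) and a ratio of 0.5 gives 16 (an octave down, twice as long). Changing pitch without changing duration needs a different technique, such as a time-stretching algorithm.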
