Audio processing: playback volume

Asked 2024-09-28 12:49:03

I want to read a sound file from the application bundle, copy it, adjust its maximum volume level (the gain value or peak power; I'm not sure of the technical term), and then write it back to the bundle as another file.

I've done the copying and writing part; the resulting file is identical to the input file. I use the AudioFileReadBytes() and AudioFileWriteBytes() functions of AudioFile Services in the AudioToolbox framework to do that.

So I have the input file's bytes and also its audio data format (via AudioFileGetProperty() with kAudioFilePropertyDataFormat), but I can't find a variable in these that would let me manipulate the original file's maximum volume level.

To clarify my purpose: I'm trying to produce another sound file whose volume level is increased or decreased relative to the original one, so I don't care about the system volume level, which is set by the user or by iOS.

Is that possible with the framework I mentioned? If not, are there any alternative suggestions?

Thanks


Edit: Walking through Sam's answer covering some audio basics, I decided to expand the question with another alternative.

Can I use Audio Queue Services to record an existing sound file (which is in the bundle) to another file, and manipulate the volume level (with the help of the framework) during the recording phase?


Update: Here's how I'm reading the input file and writing the output. The code below lowers the sound level for "some" of the amplitude values, but with lots of noise. Interestingly, if I choose 0.5 as the amplitude value it increases the sound level instead of lowering it, but when I use 0.1 it does lower the sound. Both cases involve disturbing noise. I think that's why Art is talking about normalization, but I know nothing about normalization.

AudioFileID inFileID;

CFURLRef inURL = [self inSoundURL];

AudioFileOpenURL(inURL, kAudioFileReadPermission, kAudioFileWAVEType, &inFileID);

UInt32 fileSize = [self audioFileSize:inFileID];
Float32 *inData = malloc(fileSize * sizeof(Float32)); //I used Float32 type with jv42's suggestion
AudioFileReadBytes(inFileID, false, 0, &fileSize, inData);

Float32 *outData = malloc(fileSize * sizeof(Float32));

//Art's suggestion, if I've correctly understood him

float ampScale = 0.5f; //this will reduce the 'volume' by -6db
for (int i = 0; i < fileSize; i++) {
    outData[i] = (Float32)(inData[i] * ampScale);
}

AudioStreamBasicDescription outDataFormat = [self audioDataFormat:inFileID];

AudioFileID outFileID;

CFURLRef outURL = [self outSoundURL];
AudioFileCreateWithURL(outURL, kAudioFileWAVEType, &outDataFormat, kAudioFileFlags_EraseFile, &outFileID);

AudioFileWriteBytes(outFileID, false, 0, &fileSize, outData);

AudioFileClose(outFileID);
AudioFileClose(inFileID);
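
For reference, the audioFileSize: and audioDataFormat: helpers used above aren't shown in the question. Plausible implementations built on AudioFileGetProperty() might look like this (a sketch of what they could be, not the asker's actual code):

- (UInt32)audioFileSize:(AudioFileID)fileID {
    UInt64 byteCount = 0;
    UInt32 propSize = sizeof(byteCount);
    // total size of the audio data, in bytes
    AudioFileGetProperty(fileID, kAudioFilePropertyAudioDataByteCount, &propSize, &byteCount);
    return (UInt32)byteCount;
}

- (AudioStreamBasicDescription)audioDataFormat:(AudioFileID)fileID {
    AudioStreamBasicDescription format = {0};
    UInt32 propSize = sizeof(format);
    // sample rate, bits per channel, channel count, etc.
    AudioFileGetProperty(fileID, kAudioFilePropertyDataFormat, &propSize, &format);
    return format;
}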



4 Answers

难忘№最初的完美 2024-10-05 12:49:03

You won't find amplitude-scaling operations in (Ext)AudioFile, because it's about the simplest DSP you can do.

Let's assume you use ExtAudioFile to convert whatever you read into 32-bit floats. To change the amplitude, you simply multiply:

float ampScale = 0.5f; //this will reduce the 'volume' by -6db
for (int ii=0; ii<numSamples; ++ii) {
    *sampOut = *sampIn * ampScale;
    sampOut++; sampIn++;
}

To increase the gain, just use a scale greater than 1.f. For example, an ampScale of 2.f would give you +6 dB of gain.
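
For reference, those dB figures follow from the standard relation dB = 20 * log10(scale). A small helper (hypothetical, not part of any Apple API) for turning a desired dB change into a linear scale factor:

#include <math.h>

// Convert a gain in decibels to a linear amplitude scale factor:
// 0 dB -> 1.0, +6 dB -> ~2.0, -6 dB -> ~0.5
static float LinearGainFromDecibels(float dB) {
    return powf(10.0f, dB / 20.0f);
}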

If you want to normalize, you have to make two passes over the audio: one to find the sample with the greatest amplitude, and another to actually apply the gain you computed.
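
A minimal sketch of that two-pass normalization, assuming 32-bit float samples already in memory (the function name is mine):

#include <math.h>

// Pass 1: find the peak magnitude. Pass 2: scale so the peak lands on
// 'target' (e.g. 1.0f for full scale).
static void NormalizeSamples(float *samples, int numSamples, float target) {
    float peak = 0.0f;
    for (int i = 0; i < numSamples; ++i) {
        float mag = fabsf(samples[i]);
        if (mag > peak) peak = mag;
    }
    if (peak == 0.0f) return; // all silence, nothing to scale
    float gain = target / peak;
    for (int i = 0; i < numSamples; ++i) {
        samples[i] *= gain;
    }
}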

Using Audio Queue Services just to get access to a volume property is serious, serious overkill.

Update:

In your updated code, you're multiplying each byte by 0.5 instead of each sample. Here's a quick-and-dirty fix for your code, but see my notes below; I wouldn't do what you're doing.

...

// create short pointers to our byte data
int16_t *inDataShort = (int16_t *)inData;
int16_t *outDataShort = (int16_t *)outData; // point at the output buffer, not inData

int16_t ampScale = 2;
for (int i = 0; i < fileSize / sizeof(int16_t); i++) { // fileSize counts bytes, not 16-bit samples
    outDataShort[i] = inDataShort[i] / ampScale;
}

...

Of course, this isn't the best way to do things: it assumes your file is little-endian, 16-bit, signed linear PCM. (Most WAV files are, but AIFF, m4a, mp3, and so on are not.) I'd use the ExtAudioFile API instead of the AudioFile API, because it converts whatever format you're reading into whatever format you want to work with in code. Usually the simplest thing to do is read your samples in as 32-bit floats. Here's an example of your code using the ExtAudioFile API to handle any input file format, including stereo vs. mono:

void ScaleAudioFileAmplitude(NSURL *theURL, float ampScale) {
    OSStatus err = noErr;

    ExtAudioFileRef audiofile;
    ExtAudioFileOpenURL((CFURLRef)theURL, &audiofile);
    assert(audiofile);

    // get some info about the file's format.
    AudioStreamBasicDescription fileFormat;
    UInt32 size = sizeof(fileFormat);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileDataFormat, &size, &fileFormat);

    // we'll need to know what type of file it is later when we write 
    AudioFileID aFile;
    size = sizeof(aFile);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_AudioFile, &size, &aFile);
    AudioFileTypeID fileType;
    size = sizeof(fileType);
    err = AudioFileGetProperty(aFile, kAudioFilePropertyFileFormat, &size, &fileType);


    // tell the ExtAudioFile API what format we want samples back in
    AudioStreamBasicDescription clientFormat;
    bzero(&clientFormat, sizeof(clientFormat));
    clientFormat.mChannelsPerFrame = fileFormat.mChannelsPerFrame;
    clientFormat.mBytesPerFrame = 4;
    clientFormat.mBytesPerPacket = clientFormat.mBytesPerFrame;
    clientFormat.mFramesPerPacket = 1;
    clientFormat.mBitsPerChannel = 32;
    clientFormat.mFormatID = kAudioFormatLinearPCM;
    clientFormat.mSampleRate = fileFormat.mSampleRate;
    clientFormat.mFormatFlags = kLinearPCMFormatFlagIsFloat | kAudioFormatFlagIsNonInterleaved;
    err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);

    // find out how many frames we need to read
    SInt64 numFrames = 0;
    size = sizeof(numFrames);
    err = ExtAudioFileGetProperty(audiofile, kExtAudioFileProperty_FileLengthFrames, &size, &numFrames);

    // create the buffers for reading in data
    AudioBufferList *bufferList = malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (clientFormat.mChannelsPerFrame - 1));
    bufferList->mNumberBuffers = clientFormat.mChannelsPerFrame;
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        bufferList->mBuffers[ii].mDataByteSize = sizeof(float) * numFrames;
        bufferList->mBuffers[ii].mNumberChannels = 1;
        bufferList->mBuffers[ii].mData = malloc(bufferList->mBuffers[ii].mDataByteSize);
    }

    // read in the data
    UInt32 rFrames = (UInt32)numFrames;
    err = ExtAudioFileRead(audiofile, &rFrames, bufferList);

    // close the file
    err = ExtAudioFileDispose(audiofile);

    // process the audio
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        float *fBuf = (float *)bufferList->mBuffers[ii].mData;
        for (int jj=0; jj < rFrames; ++jj) {
            *fBuf = *fBuf * ampScale;
            fBuf++;
        }
    }

    // open the file for writing
    err = ExtAudioFileCreateWithURL((CFURLRef)theURL, fileType, &fileFormat, NULL, kAudioFileFlags_EraseFile, &audiofile);

    // tell the ExtAudioFile API what format we'll be sending samples in
    err = ExtAudioFileSetProperty(audiofile, kExtAudioFileProperty_ClientDataFormat, sizeof(clientFormat), &clientFormat);

    // write the data
    err = ExtAudioFileWrite(audiofile, rFrames, bufferList);

    // close the file
    ExtAudioFileDispose(audiofile);

    // destroy the buffers
    for (int ii=0; ii < bufferList->mNumberBuffers; ++ii) {
        free(bufferList->mBuffers[ii].mData);
    }
    free(bufferList);
    bufferList = NULL;

}
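
A hypothetical call site for the function above (the resource name is made up; note that the app bundle is read-only on iOS, so you'd copy the file somewhere writable, such as Documents, before scaling it in place):

NSURL *srcURL = [[NSBundle mainBundle] URLForResource:@"sound" withExtension:@"wav"];
NSURL *docsURL = [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                         inDomains:NSUserDomainMask] lastObject];
NSURL *dstURL = [docsURL URLByAppendingPathComponent:@"sound-scaled.wav"];
// ignoring errors and pre-existing files for brevity
[[NSFileManager defaultManager] copyItemAtURL:srcURL toURL:dstURL error:NULL];

ScaleAudioFileAmplitude(dstURL, 0.5f); // scale amplitude by 0.5, roughly -6 dB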

从来不烧饼 2024-10-05 12:49:03

I think you should avoid working with 8-bit unsigned chars for audio if you can.
Try to get the data as 16-bit or 32-bit; that would avoid some noise/quality problems.


人心善变 2024-10-05 12:49:03

For most common audio file formats there isn't a single master volume variable. Instead, you will need to take (or convert to) the PCM sound samples and perform at least some minimal digital signal processing (multiply, saturate/limit/AGC, quantization noise shaping, etc.) on each sample.
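
As an illustration of the multiply-and-saturate step for 16-bit PCM (a sketch with a made-up function name):

#include <stdint.h>

// Scale 16-bit PCM samples by 'gain', saturating at the int16 limits
// instead of wrapping around (wrap-around is what produces harsh noise).
static void ScaleInt16Saturating(int16_t *samples, int numSamples, float gain) {
    for (int i = 0; i < numSamples; ++i) {
        float v = samples[i] * gain;
        if (v > 32767.0f)  v = 32767.0f;
        if (v < -32768.0f) v = -32768.0f;
        samples[i] = (int16_t)v;
    }
}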


小矜持 2024-10-05 12:49:03

If the sound file is normalized, there's nothing you can do to make it louder. Except in the case of poorly encoded audio, volume is almost entirely the realm of the playback engine.

http://en.wikipedia.org/wiki/Audio_bit_depth

Properly stored audio files will have a peak volume at or near the maximum value available for the file's bit depth. If you attempt to 'decrease the volume' of a sound file, you'll essentially just be degrading the sound quality.

