Mixing two files together using Extended Audio File Services
I am doing some custom audio post-processing with Audio Units. I have two files that I am mixing together (links below), but I am getting some strange noise in the output. What am I doing wrong?
I have verified that, before this step, the two files (workTrack1 and workTrack2) are in the correct state and sound good. No errors are hit during the process, either.
Buffer processing code:
- (BOOL)mixBuffersWithBuffer1:(const int16_t *)buffer1
                      buffer2:(const int16_t *)buffer2
                    outBuffer:(int16_t *)mixbuffer
          outBufferNumSamples:(int)mixbufferNumSamples {
    BOOL clipping = NO;
    for (int i = 0; i < mixbufferNumSamples; i++) {
        int32_t s1 = buffer1[i];
        int32_t s2 = buffer2[i];
        int32_t mixed = s1 + s2;
        if ((mixed < -32768) || (mixed > 32767)) {
            clipping = YES; // don't break here; we don't want to lose data, only warn the user
        }
        mixbuffer[i] = (int16_t)mixed;
    }
    return clipping;
}
Mixing code:
////////////////////////////////////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////// PHASE 4 ////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// In phase 4, open workTrack1 and workTrack2 for reading,
// mix together, and write out to outfile.
// Open the outfile for writing -- this will erase the infile if they are the
// same, but that's OK because we are done with it.
err = [self openExtAudioFileForWriting:outPath audioFileRefPtr:&outputAudioFileRef numChannels:numChannels];
if (err) { [self cleanupInBuffer1:inBuffer1 inBuffer2:inBuffer2 outBuffer:outBuffer err:err]; return NO; }

// Set up vars
framesRead = 0;
totalFrames = [self totalFrames:mixAudioFile1Ref]; // the long one
NSLog(@"Mix-down phase, %d frames (%0.2f secs)", totalFrames, totalFrames / RECORD_SAMPLES_PER_SECOND);
moreToProcess = YES;

while (moreToProcess) {
    conversionBuffer1.mBuffers[0].mDataByteSize = LOOPER_BUFFER_SIZE;
    conversionBuffer2.mBuffers[0].mDataByteSize = LOOPER_BUFFER_SIZE;
    UInt32 frameCount1 = framesInBuffer;
    UInt32 frameCount2 = framesInBuffer;

    // Read a buffer of input samples up to AND INCLUDING totalFrames
    int numFramesRemaining = totalFrames - framesRead; // TODO: see if we are off by 1 here. Might have to add 1
    if (numFramesRemaining == 0) {
        moreToProcess = NO; // If no frames are to be read, then this phase is finished
    } else {
        if (numFramesRemaining < frameCount1) { // see if we are near the end
            frameCount1 = numFramesRemaining;
            frameCount2 = numFramesRemaining;
            conversionBuffer1.mBuffers[0].mDataByteSize = (frameCount1 * bytesPerFrame);
            conversionBuffer2.mBuffers[0].mDataByteSize = (frameCount2 * bytesPerFrame);
        }

        NSLog(@"Attempting to read %d frames from mixAudioFile1Ref", (int)frameCount1);
        err = ExtAudioFileRead(mixAudioFile1Ref, &frameCount1, &conversionBuffer1);
        if (err) { [self cleanupInBuffer1:inBuffer1 inBuffer2:inBuffer2 outBuffer:outBuffer err:err]; return NO; }

        NSLog(@"Attempting to read %d frames from mixAudioFile2Ref", (int)frameCount2);
        err = ExtAudioFileRead(mixAudioFile2Ref, &frameCount2, &conversionBuffer2);
        if (err) { [self cleanupInBuffer1:inBuffer1 inBuffer2:inBuffer2 outBuffer:outBuffer err:err]; return NO; }

        NSLog(@"Read %d frames from mixAudioFile1Ref in mix-down phase", (int)frameCount1);
        NSLog(@"Read %d frames from mixAudioFile2Ref in mix-down phase", (int)frameCount2);

        // If no frames were returned, phase is finished
        if (frameCount1 == 0) {
            moreToProcess = NO;
        } else { // Process PCM data
            // If buffer2 was not filled, fill with zeros
            if (frameCount2 < frameCount1) {
                bzero(inBuffer2 + frameCount2, (frameCount1 - frameCount2));
                frameCount2 = frameCount1;
            }

            const int numSamples = (frameCount1 * bytesPerFrame) / sizeof(int16_t);
            if ([self mixBuffersWithBuffer1:(const int16_t *)inBuffer1
                                    buffer2:(const int16_t *)inBuffer2
                                  outBuffer:(int16_t *)outBuffer
                        outBufferNumSamples:numSamples]) {
                NSLog(@"Clipping");
            }

            // Write PCM data to the main output file
            conversionOutBuffer.mBuffers[0].mDataByteSize = (frameCount1 * bytesPerFrame);
            err = ExtAudioFileWrite(outputAudioFileRef, frameCount1, &conversionOutBuffer);
            framesRead += frameCount1;
        } // frame count
    } // else

    if (err) {
        moreToProcess = NO;
    }
} // while moreToProcess

// Check for errors
TTDASSERT(framesRead == totalFrames);
if (err) {
    if (error) *error = [NSError errorWithDomain:kUAAudioSelfCrossFaderErrorDomain
                                            code:UAAudioSelfCrossFaderErrorTypeMixDown
                                        userInfo:[NSDictionary dictionaryWithObjectsAndKeys:
                                                     [NSNumber numberWithInt:err], @"Underlying Error Code",
                                                     [self commonExtAudioResultCode:err], @"Underlying Error Name", nil]];
    [self cleanupInBuffer1:inBuffer1 inBuffer2:inBuffer2 outBuffer:outBuffer err:err];
    return NO;
}
NSLog(@"Done with mix-down phase");
Assumptions:

- mixAudioFile1Ref is always longer than mixAudioFile2Ref
- After mixAudioFile2Ref runs out of bytes, outputAudioFileRef should sound exactly like mixAudioFile1Ref

The expected sound is a mix with a fade-in at the start and a fade-out, so that the track self-crossfades when it loops. Please listen to the output, look at the code, and let me know where I am going wrong.
Source sound: http://cl.ly/2g2F2A3k1r3S36210V23
Resulting sound: http://cl.ly/3q2w3S3Y0x0M3i2a1W3v
2 Answers
Turns out there were two problems here.

Buffer processing code

    int32_t mixed = s1 + s2;

was causing clipping. A better way is to divide by the number of channels mixed:

    int32_t mixed = (s1 + s2) / 2;

then normalize in another pass later.

Frames != bytes

When zeroing out the second track's buffers when the sound ran out, I was incorrectly setting the offset and duration as frames, not bytes. This produced garbage in the buffer and created the noise you hear periodically. Easy to fix:
Now the sample is great: http://cl.ly/1E2q1L441s2b3e2X2z0J
Your posted answer looks good; I can only see one minor problem. Your solution for the clipping, dividing by two, will help, but it is also equivalent to applying a 50% gain reduction. That is not the same as normalization; normalization is the process of looking through an entire audio file, finding the highest peak, and applying a gain reduction so that this peak hits a certain level (usually 0.0 dB). The result is that under normal (i.e., non-clipping) circumstances, the output signal will be very low and need to be boosted again.
During your mixdown, you no doubt encountered an overflow which caused distortion, since the value would wrap around and cause a jump in the signal. What you want to do instead is to apply a technique called a "brick-wall limiter", which basically applies a hard ceiling to samples which are clipping. The simplest way to do this is:
The result of this technique is that you will hear a bit of distortion around samples which are clipping, but the sound will not be completely mangled as would be the case with integer overflow. The distortion, although present, doesn't destroy the listening experience.