Extracting audio channels from linear PCM

Posted 2024-10-10 14:46:08

I would like to extract the individual channels from a raw LPCM file, i.e. extract the left and right channels of a stereo LPCM file. The LPCM is 16-bit, interleaved, 2-channel, little-endian. From what I gather, the byte order is {LeftChannel, RightChannel, LeftChannel, RightChannel, ...}, and since it is 16-bit depth there will be 2 bytes per sample for each channel, right?

So my question is: if I want to extract the left channel, would I take the bytes at addresses 0, 2, 4, 6, ..., n*2, while the right channel would be at 1, 3, 5, ..., (n*2+1)?

Also, after extracting the audio channel, should I set the format of the extracted channel to 16-bit depth, 1 channel?

Thanks in advance

This is the code that I currently use to extract PCM audio from AssetReader. It works fine when writing a music file without extracting a channel, so the problem might be caused by the format or something else...

NSURL *assetURL = [song valueForProperty:MPMediaItemPropertyAssetURL];
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetURL options:nil];
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey, 
                                [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                [NSNumber numberWithInt:2], AVNumberOfChannelsKey,
                            //  [NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
                                [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                nil];
NSError *assetError = nil;
AVAssetReader *assetReader = [[AVAssetReader assetReaderWithAsset:songAsset
                                                            error:&assetError]
                              retain];
if (assetError) {
    NSLog (@"error: %@", assetError);
    return;
}

AVAssetReaderOutput *assetReaderOutput = [[AVAssetReaderAudioMixOutput 
                                           assetReaderAudioMixOutputWithAudioTracks:songAsset.tracks
                                           audioSettings: outputSettings]
                                          retain];
if (! [assetReader canAddOutput: assetReaderOutput]) {
    NSLog (@"can't add reader output... die!");
    return;
}
[assetReader addOutput: assetReaderOutput];


NSArray *dirs = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectoryPath = [dirs objectAtIndex:0];

//CODE TO SPLIT STEREO
[self setupAudioWithFormatMono:kAudioFormatLinearPCM];
NSString *splitExportPath = [[documentsDirectoryPath stringByAppendingPathComponent:@"monoleft.caf"] retain];
if ([[NSFileManager defaultManager] fileExistsAtPath:splitExportPath]) {
    [[NSFileManager defaultManager] removeItemAtPath:splitExportPath error:nil];
}

AudioFileID mRecordFile;
NSURL *splitExportURL = [NSURL fileURLWithPath:splitExportPath];


OSStatus status =  AudioFileCreateWithURL(splitExportURL, kAudioFileCAFType, &_streamFormat, kAudioFileFlags_EraseFile,
                                          &mRecordFile);

NSLog(@"status os %d",status);

[assetReader startReading];

CMSampleBufferRef sampBuffer = [assetReaderOutput copyNextSampleBuffer];
UInt32 countsamp= CMSampleBufferGetNumSamples(sampBuffer);
NSLog(@"number of samples %d",countsamp);

SInt64 countByteBuf = 0;
SInt64 countPacketBuf = 0;
UInt32 numBytesIO = 0;
UInt32 numPacketsIO = 0;
NSMutableData * bufferMono = [NSMutableData new];
while (sampBuffer) {


    AudioBufferList  audioBufferList;
    CMBlockBufferRef blockBuffer;
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampBuffer, NULL, &audioBufferList, sizeof(audioBufferList), NULL, NULL, 0, &blockBuffer);
    for (int y=0; y<audioBufferList.mNumberBuffers; y++) {
        AudioBuffer audioBuffer = audioBufferList.mBuffers[y];
        //frames = audioBuffer.mData;
        NSLog(@"the number of channel for buffer number %d is %d",y,audioBuffer.mNumberChannels);
        NSLog(@"The buffer size is %d",audioBuffer.mDataByteSize);

        //Append mono left to buffer data
        for (int i=0; i<audioBuffer.mDataByteSize; i= i+4) {
            [bufferMono appendBytes:(audioBuffer.mData+i) length:2];
        }

        //the number of bytes in the mutable data containing mono audio file
        numBytesIO = [bufferMono length];
        numPacketsIO = numBytesIO/2;
        NSLog(@"numpacketsIO %d",numPacketsIO);
        status = AudioFileWritePackets(mRecordFile, NO, numBytesIO, &_packetFormat, countPacketBuf, &numPacketsIO, audioBuffer.mData);
        NSLog(@"status for writebyte %d, packets written %d",status,numPacketsIO);
        if(numPacketsIO != (numBytesIO/2)){
            NSLog(@"Something wrong");
            assert(0);
        }


        countPacketBuf = countPacketBuf + numPacketsIO;
        [bufferMono setLength:0];


    }

    sampBuffer = [assetReaderOutput copyNextSampleBuffer];
    countsamp= CMSampleBufferGetNumSamples(sampBuffer);
    NSLog(@"number of samples %d",countsamp);
}
AudioFileClose(mRecordFile);
[assetReader cancelReading];
[self performSelectorOnMainThread:@selector(updateCompletedSizeLabel:)
                       withObject:0
                    waitUntilDone:NO];

The output format passed to Audio File Services is as follows:

    _streamFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    _streamFormat.mBitsPerChannel = 16;
    _streamFormat.mChannelsPerFrame = 1;
    _streamFormat.mBytesPerPacket = 2;
    _streamFormat.mBytesPerFrame = 2;// (_streamFormat.mBitsPerChannel / 8) * _streamFormat.mChannelsPerFrame;
    _streamFormat.mFramesPerPacket = 1;
    _streamFormat.mSampleRate = 44100.0;

    _packetFormat.mStartOffset = 0;
    _packetFormat.mVariableFramesInPacket = 0;
    _packetFormat.mDataByteSize = 2;


Comments (3)

对你再特殊 2024-10-17 14:46:08

Sounds almost right - you have a 16 bit depth, so that means each sample will take 2 bytes. That means the left channel data will be in bytes {0,1}, {4,5}, {8,9} and so on. Interleaved means the samples are interleaved, not the bytes.
Other than that I would try it out and see if you have any problems with your code.
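
The byte layout described above can be sketched in plain C. This is only an illustration under stated assumptions: `extract_channel` is a hypothetical helper (not part of Core Audio), the input is packed 16-bit interleaved stereo, and `dst` has room for half of the source bytes. It copies bytes {0,1}, {4,5}, {8,9}, ... for channel 0 and bytes {2,3}, {6,7}, {10,11}, ... for channel 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy one channel out of packed 16-bit interleaved
 * stereo data. channel 0 = left, channel 1 = right.
 * Returns the number of bytes written to dst (2 bytes per frame). */
static size_t extract_channel(const uint8_t *src, size_t src_bytes,
                              unsigned channel, uint8_t *dst)
{
    const size_t bytes_per_sample = 2;                    /* 16-bit depth */
    const size_t bytes_per_frame  = 2 * bytes_per_sample; /* L + R */
    const size_t frames = src_bytes / bytes_per_frame;

    assert(channel < 2);

    for (size_t i = 0; i < frames; i++) {
        /* Start of the wanted sample inside frame i. */
        const uint8_t *sample =
            src + i * bytes_per_frame + channel * bytes_per_sample;
        /* Copy both bytes of the sample; byte order is preserved as-is. */
        dst[i * bytes_per_sample]     = sample[0];
        dst[i * bytes_per_sample + 1] = sample[1];
    }
    return frames * bytes_per_sample;
}
```

Whatever buffer receives the extracted samples is the one that then has to be written to the output file. The question's code fills `bufferMono` but passes `audioBuffer.mData` to `AudioFileWritePackets`, which may be worth checking.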

"Also after extracting the audio channel, should I set the format of the extracted channel as 16 bit depth, 1 channel?"

Only one of the two channels is remaining after your extraction, so yes, this is correct.
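
As a rough sketch of what "16-bit depth, 1 channel" implies for the derived size fields, the snippet below applies the formula from the comment in the question's code, `(mBitsPerChannel / 8) * mChannelsPerFrame`. The `PcmFormat` struct and `make_packed_pcm_format` are hypothetical stand-ins for the corresponding `AudioStreamBasicDescription` fields, not the real Core Audio type, and the sketch assumes packed, uncompressed PCM with one frame per packet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the AudioStreamBasicDescription fields
 * that the question sets by hand. */
typedef struct {
    double   sample_rate;
    uint32_t bits_per_channel;
    uint32_t channels_per_frame;
    uint32_t frames_per_packet;
    uint32_t bytes_per_frame;
    uint32_t bytes_per_packet;
} PcmFormat;

/* Derive the size fields for packed, uncompressed PCM. */
static PcmFormat make_packed_pcm_format(double sample_rate,
                                        uint32_t bits_per_channel,
                                        uint32_t channels)
{
    PcmFormat f;

    assert(bits_per_channel % 8 == 0 && channels > 0);

    f.sample_rate        = sample_rate;
    f.bits_per_channel   = bits_per_channel;
    f.channels_per_frame = channels;
    f.frames_per_packet  = 1; /* assumption: one frame per packet */
    f.bytes_per_frame    = (bits_per_channel / 8) * channels;
    f.bytes_per_packet   = f.bytes_per_frame * f.frames_per_packet;
    return f;
}
```

For 16-bit mono this gives 2 bytes per frame and per packet, which matches the values in the question; the same call with 2 channels gives 4.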

青衫负雪 2024-10-17 14:46:08

I had a similar error where the audio sounded 'slow'. The reason is that you specified an mChannelsPerFrame of 1, whereas you have dual-channel sound. Set it to 2 and it should speed up the playback. Also, do tell whether the output sounds correct after you do this... :)
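
One way to see why a channel-count mismatch sounds "slow" is to compute how long the same bytes play under each interpretation. The sketch below is plain arithmetic, not a Core Audio call; `playback_seconds` is a hypothetical helper and assumes packed integer PCM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: duration of a block of packed PCM data when it is
 * interpreted with the given bit depth, channel count and sample rate. */
static double playback_seconds(size_t data_bytes, unsigned bits_per_channel,
                               unsigned channels_per_frame, double sample_rate)
{
    size_t bytes_per_frame = (bits_per_channel / 8) * channels_per_frame;

    assert(bytes_per_frame > 0 && sample_rate > 0.0);

    size_t frames = data_bytes / bytes_per_frame;
    return (double)frames / sample_rate;
}
```

With 176400 bytes of 16-bit data at 44100 Hz, the stereo interpretation (4 bytes per frame) lasts 1 second, while the mono interpretation (2 bytes per frame) lasts 2 seconds. So if interleaved stereo bytes end up in a file whose header declares one channel, playback takes twice as long. If the data written really contains a single channel, a header with 1 channel matches it.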

肩上的翅膀 2024-10-17 14:46:08

I'm trying to split my stereo audio into two mono files (split stereo audio to mono streams on iOS). I've been using your code but can't seem to get it to work. What are the contents of your setupAudioWithFormatMono method?
