如何使用 AVAssetReader 在 iOS 上正确读取解码的 PCM 样本 - 当前解码不正确
作为计算机科学学士学位的一部分,我目前正在开发一个应用程序。该应用程序会将来自 iPhone 硬件(加速计、GPS)的数据与正在播放的音乐关联起来。
该项目仍处于起步阶段,只花了两个月的时间。
我现在需要帮助的地方是从 itunes 库中的歌曲中读取 PCM 样本,并使用音频单元播放它们。 目前我想要的实现执行以下操作:从 iTunes 中选择一首随机歌曲,并在需要时从中读取样本,然后存储在缓冲区中,我们将其称为“sampleBuffer”。随后,在消费者模型中,音频单元(具有混音器和 RemoteIO 输出)有一个回调,我只需将所需数量的样本从 SampleBuffer 复制到回调中指定的缓冲区中即可。然后我通过扬声器听到的内容并不完全是我所期望的;我可以识别出它正在播放歌曲,但似乎解码不正确并且有很多噪音!我附上了一张图像,显示了前半秒(24576 个样本@ 44.1kHz),这与正常的输出不同。 在进入列表之前,我检查了文件是否未损坏,类似地,我已经为缓冲区编写了测试用例(所以我知道缓冲区不会改变样本),尽管这可能不是最好的方法(有些人会争论要走音频队列路线),我想对样本执行各种操作以及在完成之前更改歌曲,重新安排播放的歌曲等。此外,音频中可能有一些不正确的设置单位,但是,显示样本的图表(这表明样本解码不正确)是直接从缓冲区获取的,因此我现在只想解决为什么从磁盘读取和解码无法正常工作的问题。现在我只想通过工作来玩。 无法发布图像,因为刚接触 stackoverflow,因此这里是图像的链接: https://i.sstatic.net/ RHjlv.jpg
清单:
这是我设置用于 AVAssetReaderAudioMixOutput 的 audioReadSettigns 的地方
// Set the read settings
audioReadSettings = [[NSMutableDictionary alloc] init];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
[audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
[audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
现在,以下代码清单是一个接收 NSString 的方法,其 persistant_id 为歌曲:
-(BOOL)setNextSongID:(NSString*)persistand_id {
assert(persistand_id != nil);
MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];
NSError *assetError = nil;
assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];
if (assetError) {
NSLog(@"error: %@", assetError);
return NO;
}
CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
[assetReader setTimeRange:timeRange];
track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
audioSettings:audioReadSettings];
if (![assetReader canAddOutput:assetReaderOutput]) {
NSLog(@"cant add reader output... die!");
return NO;
}
[assetReader addOutput:assetReaderOutput];
[assetReader startReading];
// just getting some basic information about the track to print
NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
for (unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
if (asDesc) {
// get data
numChannels = asDesc->mChannelsPerFrame;
sampleRate = asDesc->mSampleRate;
asDesc->Print();
}
}
[self copyEnoughSamplesToBufferForLength:24000];
return YES;
}
下面介绍了函数 -(void)copyEnoughSamplesToBufferForLength:
-(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {
[w_lock lock];
int stillToCopy = 0;
if (sampleBuffer->numSamples() < samples_count) {
stillToCopy = samples_count;
}
NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];
CMSampleBufferRef sampleBufferRef;
SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));
int a = 0;
while (stillToCopy > 0) {
sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
if (!sampleBufferRef) {
// end of song or no more samples
return;
}
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
0,
&blockBuffer);
int data_length = floorf(numSamplesInBuffer * 1.0f);
int j = 0;
for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
for (int i=0; i < numSamplesInBuffer; i++) {
dataBuffer[j] = samples[i];
j++;
}
}
CFRelease(sampleBufferRef);
sampleBuffer->putSamples(dataBuffer, j);
stillToCopy = stillToCopy - data_length;
}
free(dataBuffer);
[w_lock unlock];
[apool release];
}
现在,sampleBuffer 将包含错误解码的样本。谁能帮助我为什么会这样?我的 iTunes 库中的不同文件(mp3、aac、wav 等)会发生这种情况。 任何帮助将不胜感激,此外,如果您需要我的代码的任何其他列表,或者输出听起来像什么,我将根据请求附上它。过去一周我一直在尝试调试它,但在网上没有找到任何帮助——每个人似乎都按照我的方式做,但似乎只有我有这个问题。
感谢您的任何帮助!
彼得
I am currently working on an application as part of my Bachelor in Computer Science. The application will correlate data from the iPhone hardware (accelerometer, gps) and music that is being played.
The project is still in its infancy, having worked on it for only 2 months.
The moment that I am right now, and where I need help, is reading PCM samples from songs from the itunes library, and playing them back using and audio unit.
Currently the implementation I would like working does the following: chooses a random song from iTunes, and reads samples from it when required, and stores in a buffer, lets call it sampleBuffer. Later on in the consumer model the audio unit (which has a mixer and a remoteIO output) has a callback where I simply copy the required number of samples from sampleBuffer into the buffer specified in the callback. What i then hear through the speakers is something not quite what i expect; I can recognize that it is playing the song however it seems that it is incorrectly decoded and it has a lot of noise! I attached an image which shows the first ~half a second (24576 samples @ 44.1kHz), and this does not resemble a normall looking output.
Before I get into the listing I have checked that the file is not corrupted, similarily I have written test cases for the buffer (so I know the buffer does not alter the samples), and although this might not be the best way to do it (some would argue to go the audio queue route), I want to perform various manipulations on the samples aswell as changing the song before it is finished, rearranging what song is played, etc. Furthermore, maybe there are some incorrect settings in the audio unit, however, the graph that displays the samples (which shows the samples are decoded incorrectly) is taken straight from the buffer, thus I am only looking now to solve why the reading from the disk and decoding does not work correctly. Right now i simply want to get a play through working.
Cant post images because new to stackoverflow so heres the link to the image: https://i.sstatic.net/RHjlv.jpg
Listing:
This is where I setup the audioReadSettigns which will be used for the AVAssetReaderAudioMixOutput
// Set the read settings
audioReadSettings = [[NSMutableDictionary alloc] init];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
[audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
[audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
[audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];
Now the following code listing is a method that receives an NSString with the persistant_id of the song:
-(BOOL)setNextSongID:(NSString*)persistand_id {
assert(persistand_id != nil);
MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];
NSError *assetError = nil;
assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];
if (assetError) {
NSLog(@"error: %@", assetError);
return NO;
}
CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
[assetReader setTimeRange:timeRange];
track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
audioSettings:audioReadSettings];
if (![assetReader canAddOutput:assetReaderOutput]) {
NSLog(@"cant add reader output... die!");
return NO;
}
[assetReader addOutput:assetReaderOutput];
[assetReader startReading];
// just getting some basic information about the track to print
NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
for (unsigned int i = 0; i < [formatDesc count]; ++i) {
CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
if (asDesc) {
// get data
numChannels = asDesc->mChannelsPerFrame;
sampleRate = asDesc->mSampleRate;
asDesc->Print();
}
}
[self copyEnoughSamplesToBufferForLength:24000];
return YES;
}
The following presents the function -(void)copyEnoughSamplesToBufferForLength:
-(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {
[w_lock lock];
int stillToCopy = 0;
if (sampleBuffer->numSamples() < samples_count) {
stillToCopy = samples_count;
}
NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];
CMSampleBufferRef sampleBufferRef;
SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));
int a = 0;
while (stillToCopy > 0) {
sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
if (!sampleBufferRef) {
// end of song or no more samples
return;
}
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
NULL,
&audioBufferList,
sizeof(audioBufferList),
NULL,
NULL,
0,
&blockBuffer);
int data_length = floorf(numSamplesInBuffer * 1.0f);
int j = 0;
for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
for (int i=0; i < numSamplesInBuffer; i++) {
dataBuffer[j] = samples[i];
j++;
}
}
CFRelease(sampleBufferRef);
sampleBuffer->putSamples(dataBuffer, j);
stillToCopy = stillToCopy - data_length;
}
free(dataBuffer);
[w_lock unlock];
[apool release];
}
Now the sampleBuffer will have incorrectly decoded samples. Can anyone help me why this is so? This happens for different files on my iTunes library (mp3, aac, wav, etc).
Any help would be greatly appreciated, furthermore, if you need any other listing of my code, or perhaps what the output sounds like, I will attach it per request. I have been sitting on this for the past week trying to debug it and have found no help online -- everyone seems to be doign it in my way, yet it seems that only I have this issue.
Thanks for any help at all!
Peter
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
目前,我还在从事一个项目,该项目涉及将音频样本从 iTunes 库提取到 AudioUnit 中。
包含音频单元渲染回调供您参考。输入格式设置为 SInt16StereoStreamFormat。
我使用了 Michael Tyson 的循环缓冲区实现 - TPCircularBuffer 作为缓冲区存储。非常容易使用和理解!谢谢迈克尔!
Currently, I am also working on a project which involves extracting audio samples from iTunes Library into AudioUnit.
The audiounit render call back is included for your reference. The input format is set as SInt16StereoStreamFormat.
I have made use of Michael Tyson's circular buffer implementation - TPCircularBuffer as the buffer storage. Very easy to use and understand!!! Thanks Michael!
我想有点晚了,但你可以尝试这个库:
https://bitbucket.org/artgillespie/tslibraryimport< /a>
使用它将音频保存到文件后,您可以使用来自 MixerHost 的渲染回调来处理数据。
I guess it is kind of late, but you could try this library:
https://bitbucket.org/artgillespie/tslibraryimport
After using this to save the audio into a file, you could process the data with render callbacks from MixerHost.
如果我是你,我会使用 kAudioUnitSubType_AudioFilePlayer 来播放文件并使用单位渲染回调访问其样本。
或者
使用 ExtAudioFileRef 将样本直接提取到缓冲区。
If I were you I would either use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples with the units render callback.
Or
Use ExtAudioFileRef to extract the samples straight to a buffer.