I know there are already a few questions like this here on SO, however they do not fully explain the formulas presented in the answers.
I'm writing a parser that should be able to process MPEG-1, 2, 2.5 Audio Layer I, II, III frame headers.
The goal is to calculate the exact size of the frame, including header, CRC (if present) and any data or metadata of this frame (basically the number of bytes between the start of one header and the beginning of the next one).
One of the code snippets/formulas commonly seen on the internet to achieve this is (in no specific programming language):
padding = doesThisFramehavePadding ? 1 : 0;
coefficient = sampleCount / 8;
// makes sense to me. the slot size seems to be the smallest addressable space in an mp3 frame
// and is thus important for padding.
slotSize = mpegLayer == Layer1 ? 4 : 1;
// all fine here. bitRate / sampleRate yields bits per sample, multiplied by that weird
// coefficient from earlier probably gives us <total bytes> per <all samples in this frame>.
// then add padding times slotSize.
frameSizeInBytes = ((coefficient * bitRate / sampleRate) + padding) * slotSize;
I have multiple questions regarding the above code snippet:
1. What exactly does this "coefficient" represent? As it's just sampleCount / 8, it's probably just something used to convert the units from bits to bytes in the final calculation, right?
2. If my assumption from 1. is correct: if (coefficient * bitRate / sampleRate) already yields something in bytes, what would multiplying it by the slot size achieve for Audio Layer I specifically? Wouldn't this imply that the unit of (coefficient * bitRate / sampleRate) should have been "slots" earlier, not "bytes"? If so, what does the coefficient do, i.e. why divide by 8 even for Audio Layer I frames? Is this even correct?
3. Questions 1. and 2. lead me to believe that the code snippet above may not even be correct. If so, what would the correct calculation for MPEG versions 1, 2, 2.5 and layers I, II and III look like?
4. Does the above calculation still yield the correct result if the CRC protection bit is set in the frame header (i.e. a 16-bit CRC, 2 additional bytes, follows the header)?
5. Speaking of the header: are the 4 header bytes included in the resulting frameSizeInBytes, or does the result indicate the length of the frame data/body?
Basically all these sub-questions can be summarized to:
What is the formula to calculate the total and exact length of the current frame in bytes, including the header and things like the CRC, Xing and LAME metadata frames, and other eventualities?
Comments (1)
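I wrote this in Delphi/Pascal; the function returns either 0 for a bad frame or the frame's exact size in bytes. It is based on multiple websites: the first two illustrate and explain the MPEG audio frame header with full precision, while the third has crucial additions like the formula(s):
For Layer I files use this formula:
FrameLengthInBytes = (12 * BitRate / SampleRate + Padding) * 4
For Layer II & III files use this formula:
FrameLengthInBytes = 144 * BitRate / SampleRate + Padding
If the function returns 0 you're most likely inside a metadata tag's area. The calculated frame size is for its payload (content) and does not count the 4 bytes of header data. It's exactly the number of bytes to seek forward in the file to be in front of the next frame's header. I wrote this to count frames exactly in MP3 files encoded with variable bitrates, where frame sizes can vary widely. And I was fed up with lazy overall calculations that would only do guesswork.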
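To make the origin of the constants 12 and 144 explicit, here is a minimal Python sketch (not the Delphi function described above) that derives them from the samples-per-frame count. The names `SAMPLES_PER_FRAME` and `frame_length` are made up for illustration. It assumes the bit rate is given in bits per second, and it uses 576 samples for MPEG-2/2.5 Layer III, which turns 144 into 72 for those frames. As these formulas are usually documented, the result spans the whole frame from one header start to the next, including the 4 header bytes and the 2-byte CRC if present; subtract 4 to get the distance left to skip after the header has been read.

```python
# Samples per frame, keyed by (MPEG version, layer). MPEG-2 and MPEG-2.5
# share the same values, so version 2.5 is looked up as 2.
SAMPLES_PER_FRAME = {
    (1, 1): 384, (1, 2): 1152, (1, 3): 1152,
    (2, 1): 384, (2, 2): 1152, (2, 3): 576,
}


def frame_length(version, layer, bitrate_bps, sample_rate_hz, padding):
    """Return the frame length in bytes, or 0 if it cannot be computed."""
    if bitrate_bps <= 0 or sample_rate_hz <= 0:
        return 0  # free-format or invalid header
    key = (2 if version == 2.5 else version, layer)
    if key not in SAMPLES_PER_FRAME:
        return 0
    samples = SAMPLES_PER_FRAME[key]
    slot_bytes = 4 if layer == 1 else 1
    # samples / 8 converts bits to bytes; dividing by the slot size as well
    # converts bytes to slots. This yields 12 for Layer I, 144 or 72 otherwise.
    coefficient = samples // (8 * slot_bytes)
    # Integer division truncates; the padding slot makes up the difference
    # over a run of frames.
    slots = coefficient * bitrate_bps // sample_rate_hz + (1 if padding else 0)
    return slots * slot_bytes
```

For a 128 kbit/s, 44.1 kHz MPEG-1 Layer III frame this gives 417 bytes without padding and 418 with it. Note that the snippet in the question, which uses sampleCount / 8 and then multiplies by the slot size, would come out four times too large for Layer I.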
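The "special" VBR frames that don't contain audio but additional info instead can also be detected fairly well. For this we need to know the size of a frame's "side info":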
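As an illustration of how the side info size is used, the following Python sketch looks for a Xing or Info tag directly after the header and the Layer III side info. It is not the original Delphi code; `side_info_size` and `find_vbr_tag` are hypothetical names. It assumes `frame` holds the raw bytes of one Layer III frame starting at its header, that `channel_mode` is the 2-bit header field (3 meaning mono), and that the frame carries no CRC.

```python
def side_info_size(version, channel_mode):
    """Layer III side info size in bytes for the given MPEG version."""
    mono = channel_mode == 3
    if version == 1:
        return 17 if mono else 32
    return 9 if mono else 17  # MPEG-2 and MPEG-2.5


def find_vbr_tag(frame, version, channel_mode):
    """Return "Xing" or "Info" if the frame carries that tag, else None."""
    offset = 4 + side_info_size(version, channel_mode)  # 4 header bytes first
    tag = frame[offset:offset + 4]
    if tag in (b"Xing", b"Info"):
        return tag.decode("ascii")
    return None
```

A frame for which this returns a tag holds metadata rather than audio, so it should be left out when counting audio frames.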
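You may also want to read
...which is also useful for knowing where the first audio frame is to be found (after tags at the start of the file) and when you've reached the last one (before tags at the end of the file).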
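One way to locate those boundaries, sketched in Python under the assumption that the tags in question are ID3v2 at the start of the file and ID3v1 at the end (other tag formats are not handled). The function names are invented for this example; `head` is expected to hold at least the first 10 bytes of the file and `tail` at least the last 128.

```python
def id3v2_size(head):
    """Total size in bytes of an ID3v2 tag at the start of head, or 0 if none."""
    if len(head) < 10 or head[:3] != b"ID3":
        return 0
    size = 0
    for byte in head[6:10]:
        if byte & 0x80:
            return 0  # size bytes must be synchsafe (top bit clear)
        size = (size << 7) | byte
    footer = 10 if head[5] & 0x10 else 0  # optional 10-byte footer
    return 10 + size + footer


def has_id3v1(tail):
    """True if the last 128 bytes of the file form an ID3v1 tag."""
    return len(tail) >= 128 and tail[-128:-125] == b"TAG"
```

The first audio frame would then be searched for starting at offset `id3v2_size(head)`, and the audio data ends 128 bytes before the end of the file when `has_id3v1(tail)` is true.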