Exact MP3 frame size calculation

Posted on 2025-02-02 22:11:12

I know there are already a few questions like this here on SO, however they do not fully explain the formulas presented in the answers.

I'm writing a parser that should be able to process MPEG-1, 2 and 2.5 Audio Layer I, II and III frame headers. The goal is to calculate the exact size of the frame, including header, CRC (if present) and any data or metadata of this frame (basically the number of bytes between the start of one header and the beginning of the next one).

One of the code snippets/formulas commonly seen on the internet to achieve this is (in no specific programming language):

padding = doesThisFramehavePadding ? 1 : 0;
coefficient = sampleCount / 8;

// makes sense to me. the slot size seems to be the smallest addressable space in an mp3 frame 
// and is thus important for padding.
slotSize = mpegLayer == Layer1 ? 4 : 1;

// all fine here. bitRate / sampleRate yields bits per sample, multiplied by that weird 
// coefficient from earlier probably gives us <total bytes> per <all samples in this frame>.
// then add padding times slotSize.
frameSizeInBytes = ((coefficient * bitRate / sampleRate) + padding) * slotSize;

I have multiple questions regarding the above code snippet:

1. What exactly would this "coefficient" even represent? As it's just sampleCount / 8, it's probably just something used to convert the units from bits to bytes in the final calculation, right?
2. If my assumption from 1. is correct: if (coefficient * bitRate / sampleRate) already yields something in bytes, what would multiplying it by the slot size achieve for Audio Layer I specifically? Wouldn't this imply that the unit of (coefficient * bitRate / sampleRate) should have been "slots" earlier, not "bytes"? If so, then what does the coefficient do, like why divide by 8, even for Audio Layer I frames? Is this even correct?
3. Questions 1. and 2. lead me to believe that the code snippet above may not even be correct. If so, what would the correct calculation for MPEG versions 1, 2 and 2.5 and Layers I, II and III look like?
4. Does the above calculation still yield the correct result if the CRC protection bit is set in the frame header (i.e. 16 additional CRC bytes are appended to the header)?
5. Speaking of the header: are the 4 header bytes included in the resulting frameSizeInBytes, or does the result indicate the length of the frame data/body?

Basically all these sub-questions can be summarized to:

What is the formula to calculate the total and exact length of the current frame in bytes, including the header, and stuff like CRC, or Xing and LAME meta data frames and other eventualities?

梅窗月明清似水 2025-02-09 22:11:12

I wrote that in Delphi/Pascal; the function returns either 0 for a bad frame or the frame's exact size in bytes. It is based on multiple websites: the first two illustrate and explain an MPEG audio frame header in full detail, while the third has crucial additions like the formula(s):

http://checkmate.gissen.nl/headers.php (copy of the second, but with more colors)
http://www.mp3-tech.org/programmer/frame_header.html
http://mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm (might be the original and contains the formula calculating the actual byte length of a frame for MPEG 1):

    For Layer I files use this formula: FrameLengthInBytes = (12 * BitRate / SampleRate + Padding) * 4
    For Layer II & III files use this formula: FrameLengthInBytes = 144 * BitRate / SampleRate + Padding

https://en.wikipedia.org/wiki/MP3#File_structure with a good picture illustrating the header
https://www.codeproject.com/Articles/8295/MPEG-Audio-Frame-Header for even more details about VBR frames and how to calculate the overall playback duration
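
As a rough illustration of the two quoted formulas, here is a minimal Python sketch. The function name frame_length, its parameters and the lookup table are made up for this example and are not part of the Pascal code; the bitrate is assumed to be given in bit/s, and the quotient is truncated to whole slots before the padding slot is added.

```python
# Samples per frame, keyed by (is_mpeg1, layer). MPEG-2 and MPEG-2.5 share values.
SAMPLES_PER_FRAME = {
    (True, 1): 384, (True, 2): 1152, (True, 3): 1152,
    (False, 1): 384, (False, 2): 1152, (False, 3): 576,
}


def frame_length(is_mpeg1: bool, layer: int, bitrate: int,
                 sample_rate: int, padding: bool) -> int:
    """Frame length in bytes; bitrate in bit/s, sample_rate in Hz."""
    slot_size = 4 if layer == 1 else 1  # bytes per slot
    samples = SAMPLES_PER_FRAME[(is_mpeg1, layer)]
    # Coefficient in slots: 12 for Layer I, 144 or 72 for Layers II and III.
    coefficient = samples // 8 // slot_size
    # Whole slots only: truncate before adding the padding slot.
    slots = coefficient * bitrate // sample_rate
    return (slots + int(padding)) * slot_size
```

For 128 kbit/s at 44.1 kHz, Layer II, this yields 417 bytes without and 418 bytes with padding, which matches the example quoted from the mpgedit page.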

const
  MPEG_BITRATE: Array[0.. 1, 1.. 3, 0.. 14] of Word=  // MPEG 2/1, Layer III/II/I
  ( ( ( 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160 )  // 2 Layer III
    , ( 0,  8, 16, 24,  32,  40,  48,  56,  64,  80,  96, 112, 128, 144, 160 )  // 2 Layer II
    , ( 0, 32, 48, 56,  64,  80,  96, 112, 128, 144, 160, 176, 192, 224, 256 )  // 2 Layer I
    )
  , ( ( 0, 32, 40, 48,  56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320 )  // 1 Layer III
    , ( 0, 32, 48, 56,  64,  80,  96, 112, 128, 160, 192, 224, 256, 320, 384 )  // 1 Layer II
    , ( 0, 32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448 )  // 1 Layer I
    )
  );

  MPEG_SAMPLERATE: Array[0.. 3, 0.. 2] of Word=  // MPEG 2.5/?/2/1
  ( ( 11025, 12000,  8000 )
  , (     0,     0,     0 )
  , ( 22050, 24000, 16000 )
  , ( 44100, 48000, 32000 )
  );


// Read the 4 header bytes from a stream and give back the frame length in bytes as a
// positive 16-bit value. The length is counted from the first header byte, so it
// includes the 4 header bytes and the 2 CRC bytes (if present). If fewer than 4 bytes
// can be read or a non-standard condition is met, the function exits with 0,
// indicating a bad frame.
function IsValidMpegHeader( oIn: TStream ): Word;
var
  aHead: Array[1.. 4] of Byte;  // 4 bytes.
  iBitRateKilo, iSampleRate: Word;  // 16-bit; looked up from the array constants above.
  iPadding, iSlotSize, iSamples: Byte;  // 8-bit.
begin
  result:= 0;  // Every early "exit" below reports a bad frame.
  if oIn.Read( aHead[1], 4 )<> 4 then exit;  // Read next 4 bytes into array.



  // 11 bits sync:
  if (aHead[1]<> $FF) then exit;  // First 8 bits.
  if (aHead[2] and $E0)<> $E0 then exit;  // Next 3 bits.

  // 2 bits MPEG version:
  if (aHead[2] and $18)= $08 then exit;  // $00=2.5; $08=reserved; $10=2; $18=1

  // 2 bits Audio Layer:
  if (aHead[2] and $06)= $00 then exit;  // $00=reserved; $02=III; $04=II; $06=I

  // 1 bit "Protection" flag. End of 16 bits.



  // 4 bits Bitrate:
  // 0=free format, thus allowed here, although the formula below cannot size such
  // frames (the looked-up bitrate is 0); all 4 bits set=bad.
  if (aHead[3] and $F0)= $F0 then exit;

  // 2 bits Frequency:
  if (aHead[3] and $0C)= $0C then exit;  // All bits=reserved.

  // 1 bit "Padding" flag.

  // 1 bit "Private" flag. End of 24 bits.



  // 2 bits "Channel Mode": 0=stereo; 1=joint stereo; 2=dual channel; 3=mono

  // 2 bits Mode Extension.

  // 1 bit "Copyright" flag.

  // 1 bit "Original" flag.

  // 2 bits Emphasis. End of 32 bit.
  if (aHead[4] and $03)= $02 then exit;  // $00=none; $01=50/15 µs; $02=reserved; $03=CCITT J.17



  // 1 upper bit from 2nd byte, shifted 3 bits to the right  = MPEG version
  // 2 bits      from 2nd byte, shifted 1 bit  to the right  = Audio Layer
  // 4 bits      from 3rd byte, shifted 4 bits to the right  = Bitrate
  iBitRateKilo:= MPEG_BITRATE[(aHead[2] shr 3) and 1][(aHead[2] shr 1) and 3][(aHead[3] shr 4) and $F];

  // MPEG-1 Layer II disallows specific combinations of bitrate and channel mode.
  if ((aHead[2] and $18)= $18) and ((aHead[2] and $06)= $04) then
  case iBitRateKilo of
    32, 48, 56, 80:     if (aHead[4] and $C0)<> $C0 then exit;  // Only single channel allowed.
    224, 256, 320, 384: if (aHead[4] and $C0)= $C0 then exit;  // No single channel allowed.
  end;

  // Coefficient in slots: samples per frame / 8 (bits to bytes) / slot size in bytes.
  if (aHead[2] and $18)= $18 then begin  // MPEG v1
    case aHead[2] and $06 of
      $06: iSamples:= 12;  // Layer I
    else 
      iSamples:= 144;  // Layer II and III
    end;
  end else begin  // MPEG v2 and v2.5
    case aHead[2] and $06 of
      $06: iSamples:= 12;  // Layer I
      $04: iSamples:= 144;  // Layer II
    else 
      iSamples:= 72;  // Layer III
    end;
  end;

  // Set slot size and padding (in bytes).
  if (aHead[2] and $06)= $06 then iSlotSize:= 4 else iSlotSize:= 1;  // Layer I = 32 bits.
  if (aHead[3] and $02)= $02 then iPadding := 1 else iPadding := 0;  // Padding bit.

  // 2 bits from second byte, shifted 3 bits to the right  = MPEG version
  // 2 bits from third byte,  shifted 2 bits to the right  = Frequency
  iSampleRate:= MPEG_SAMPLERATE[(aHead[2] shr 3) and 3][(aHead[3] shr 2) and 3];
  if iSampleRate= 0 then exit;


  // The division itself is a real/float one, not an Integer division. The quotient
  // must not be rounded, but instead its Integer part must be cut off from any decimals.
  // If it is 1152.9 then it still means 1152 slots, not 1153. Truncate before adding
  // the padding slot and before multiplying by the slot size, otherwise Layer I lengths
  // come out wrong. This calculation works for all MPEG versions, not just v1.
  result:= (Trunc( iSamples* iBitRateKilo* 1000/ iSampleRate )+ iPadding)* iSlotSize;


  (* Originally I thought the CRC would make the frame bigger, but after examining
     a couple of files it turned out that the 2 CRC bytes are already counted in the
     frame length. This is also confirmed by https://hydrogenaud.io/index.php/topic,119033.0.html
     indicating that this was never meant for (stored) files, but instead only for (network)
     transmissions and would indeed waste 16 valuable bits.
  if (aHead[2] and $01)= $00 then Inc( result, 2 );  // 16-bit CRC after header. *)
end;

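If the function returns 0 you're most likely inside a metadata tag. The calculated length covers the whole frame, counted from its first header byte: it includes the 4 header bytes and, if present, the 2 CRC bytes. Measured from the start of the current header, it's exactly the number of bytes to seek forward in the file to be in front of the next frame's header; because the function has already read the 4 header bytes, seek result - 4 from the current stream position.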
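
To make the seeking concrete, here is a small Python sketch that walks from one frame header to the next. It is only an illustration under stated assumptions: it handles MPEG-1 Layer III alone, measures the frame length from the first header byte (as the ISO frame layout does), and the names mpeg1_layer3_frame_length and frame_offsets are invented for this example.

```python
# Lookup tables for MPEG-1 Layer III only, to keep the sketch short.
MPEG1_L3_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
MPEG1_SAMPLE_RATES = [44100, 48000, 32000]


def mpeg1_layer3_frame_length(header: bytes) -> int:
    """Frame length in bytes counted from the header start, or 0 if unusable."""
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xFE) != 0xFA:
        return 0  # not an MPEG-1 Layer III header
    bitrate_index = header[2] >> 4
    rate_index = (header[2] >> 2) & 0x03
    if bitrate_index in (0, 15) or rate_index == 3:
        return 0  # free format, bad bitrate or reserved sample rate
    padding = (header[2] >> 1) & 0x01
    kbps = MPEG1_L3_KBPS[bitrate_index]
    return 144 * kbps * 1000 // MPEG1_SAMPLE_RATES[rate_index] + padding


def frame_offsets(data: bytes, start: int = 0) -> list:
    """Offsets of consecutive frame headers, beginning at `start`."""
    offsets = []
    pos = start
    while pos + 4 <= len(data):
        length = mpeg1_layer3_frame_length(data[pos:pos + 4])
        if length == 0:
            break  # lost sync: tag data, garbage or another MPEG variant
        offsets.append(pos)
        pos += length  # jump from this header's start to the next header's start
    return offsets
```

The loop stops at the first position that does not hold a usable header, which in a real file is usually the beginning of a trailing tag.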

1. Yes.
2. Around the formulas this is explained a bit better:

   "Padding is used to fit the bit rates exactly. For an example: 128k 44.1kHz layer II uses a lot of 418 bytes and some of 417 bytes long frames to get the exact 128k bitrate. For Layer I slot is 32 bits long, for Layer II and Layer III slot is 8 bits long."

   "First, let's distinguish two terms frame size and frame length. Frame size is the number of samples contained in a frame. It is constant and always 384 samples for Layer I and 1152 samples for Layer II and Layer III. Frame length is length of a frame when compressed. It is calculated in slots. One slot is 4 bytes long for Layer I, and one byte long for Layer II and Layer III. When you are reading MPEG file you must calculate this to be able to find each consecutive frame. Remember, frame length may change from frame to frame due to padding or bitrate switching."

   So the quotient is a number of slots, and multiplying by the slot size turns it into bytes.
3. It is correct for Layers II and III (although I wouldn't fully understand it either as written there). For Layer I the coefficient has to be 12, that is 384 / 8 / 4, a count of 4-byte slots; with sampleCount / 8 = 48 the result would be four times too large. My code uses 12. Since I wrote my code I've tested it with all variants of MP3s and I've always found the next frame at exactly the expected position.
4. Yes. The CRC follows the header only in the logical layout; its 2 bytes (16 bits) are already part of the calculated frame length, so nothing has to be added.
5. Yes. The frame header is always 4 bytes long and is included in the calculated frame length.
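
The role of the padding bit can be checked with a little exact arithmetic. The following Python sketch (the function name padding_share is made up here, and it assumes 1-byte slots, i.e. Layer II or III) computes the unpadded frame length and the share of frames that need a padding byte so that the stream averages exactly the nominal bitrate.

```python
from fractions import Fraction


def padding_share(samples_per_frame: int, bitrate: int, sample_rate: int):
    """Return (unpadded length in bytes, fraction of frames needing padding).

    Assumes 1-byte slots (Layer II/III); bitrate in bit/s, sample_rate in Hz.
    """
    # Ideal, generally non-integer number of bytes per frame.
    exact = Fraction(samples_per_frame * bitrate, 8 * sample_rate)
    base = exact.numerator // exact.denominator  # truncated frame length
    return base, exact - base
```

For 1152 samples per frame at 128 kbit/s and 44.1 kHz this gives 417 bytes and a share of 47/49: out of every 49 frames, 47 are padded to 418 bytes and 2 stay at 417 bytes.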

I wrote this to exactly count frames in MP3 files encoded with variable bitrates, where frame sizes can have very different lengths. And I was fed up with lazy overall calculations that would only do guesswork.

The "special" VBR frames that don't contain audio but instead additional info can be fairly well detected, too. For this we need to know the "side info" of a frame:

const
  // https://www.codeproject.com/Articles/8295/MPEG-Audio-Frame-Header
  MPEG_SIDEINFO: Array[0.. 1, FALSE.. TRUE] of Byte=   // MPEG 2/1, Mono/Non-mono
  ( (  9, 17 )
  , ( 17, 32 )  // Only MPEG 1 non-mono has the offset after 32 bytes
  );


// Returns TRUE if one of the identifications matches. The stream must be positioned
// right after the 4 header bytes; iHead2 and iHead4 are the 2nd and 4th of those bytes
// and have to be supplied by the caller.
function IsVbrFrame( oIn: TStream; iHead2, iHead4: Byte ): Boolean;
var
  iSideInfo: Byte;
  aIdent: Array[1.. 4] of AnsiChar;  // Like bytes, but treating it as ASCII (1 byte per char).
begin
  // 1 upper bit from 2nd byte, shifted 3 bits to the right  = MPEG version
  // 2 highest bits from 4th byte (Channel Mode) equal mode "Mono"?
  iSideInfo:= MPEG_SIDEINFO[(iHead2 shr 3) and 1][(iHead4 and $C0)<> $C0];

  // After we read the 4 bytes from the header, go forward either 9, 17 or 32 
  // bytes and read 4 bytes of identification for almost any VBR frame.
  oIn.Seek( iSideInfo, soCurrent );
  oIn.Read( aIdent[1], 4 );

  if (aIdent= 'Xing')
  or (aIdent= 'Info')
  or (aIdent= 'LAME')
  or (aIdent= 'UUUU')
  or (aIdent= 'GOGO')
  or (aIdent= 'MPGE') then begin
    result:= TRUE;
  end else begin
    // Go back the 4 bytes we just read and the sideinfo portion we skipped
    // to then always jump 32 bytes forwards, regardless of MPEG version and
    // Channel Mode. Then read 4 bytes again and check for the only known ID.
    oIn.Seek( 0- 4- iSideInfo, soCurrent );
    oIn.Seek( 32, soCurrent );
    oIn.Read( aIdent[1], 4 );

    result:= (aIdent= 'VBRI');
  end;
end;
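
The same offsets can be sketched in Python on an in-memory frame. This is a simplified illustration, not a port of the function above: the name vbr_tag is invented, it only looks for the 'Xing', 'Info' and 'VBRI' identifiers, and it assumes a Layer III frame without CRC that starts at index 0 of the given bytes.

```python
# Side info size in bytes for Layer III, keyed by (is_mpeg1, is_mono).
SIDE_INFO_SIZE = {
    (True, False): 32, (True, True): 17,
    (False, False): 17, (False, True): 9,
}


def vbr_tag(frame: bytes):
    """Return 'Xing', 'Info' or 'VBRI' if the frame carries that tag, else None."""
    if len(frame) < 4:
        return None
    is_mpeg1 = (frame[1] >> 3) & 0x01 == 1   # lowest version bit: 1 = MPEG-1
    is_mono = (frame[3] & 0xC0) == 0xC0      # channel mode 3 = mono
    # Xing/Info tag sits right after the 4 header bytes and the side info.
    xing_at = 4 + SIDE_INFO_SIZE[(is_mpeg1, is_mono)]
    tag = frame[xing_at:xing_at + 4]
    if tag in (b"Xing", b"Info"):
        return tag.decode("ascii")
    # VBRI tag always sits 32 bytes after the header, i.e. at offset 36.
    if frame[36:40] == b"VBRI":
        return "VBRI"
    return None
```

Working on a byte slice instead of a stream avoids the seek bookkeeping, at the cost of having to read the start of the frame into memory first.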

You may also want to read

...which is also useful to know where the first audio frame is to be found (after tags at the start of the file) and when you've reached the last one (before tags at the end of the file).
