Reading tag data from Ogg/Flac files

Posted 2024-08-14 08:29:57

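I'm working on a C library that reads tag information from music files. I've got ID3v2 taken care of, but I can't figure out how Ogg files are structured.

I opened an .ogg file in a hex editor and I can find the tag data, because all of it is human readable. But everything from the beginning of the file up to the tag data looks like garbage. How is this data encoded?

I don't need any help with the actual code. I just need help visualizing what an Ogg header looks like and what encoding it uses, so that I can read it. I'd like to use a non-hacky approach to reading Ogg files.

I've been looking at the Flac format, and that has been helpful.

The Flac file I'm looking at has about 350 bytes between the "fLaC" identifier and the human-readable comments section, and none of it is human readable in my hex editor, so I'm sure there has to be something important in there.

I'm using Linux, and I have no intention of porting to Windows or OS X. So if I need to use a glibc-only function to convert the encoding, I'm fine with that.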

勿忘心安 2024-08-21 08:29:57

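The Ogg file format is documented here. It has a very nice graphical visualization, as you requested, along with a detailed written description.

You might also want to look at libogg, an open-source, BSD-licensed library for reading and writing Ogg files.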
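
To help picture the header without reaching for a library, here is a minimal sketch of decoding the fixed part of one Ogg page header from a byte buffer. The field layout follows the Ogg framing documentation (27 fixed bytes, then a segment table, all integers little-endian); the names `ogg_page_header`, `read_le`, and `ogg_parse_page_header` are made up for this illustration and are not part of libogg.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of an Ogg page header (27 bytes on disk). Hypothetical type. */
typedef struct {
    uint8_t  version;       /* stream structure version, currently 0 */
    uint8_t  header_type;   /* flags: 0x01 continued, 0x02 first page, 0x04 last page */
    uint64_t granule_pos;   /* codec-defined position */
    uint32_t serial;        /* logical bitstream serial number */
    uint32_t sequence;      /* page sequence number */
    uint32_t checksum;      /* CRC of the whole page */
    uint8_t  segment_count; /* number of entries in the segment table */
    size_t   body_len;      /* sum of the segment table entries */
} ogg_page_header;

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    while (n-- > 0)
        v = (v << 8) | p[n];
    return v;
}

/* Parse the page header at buf. Returns the header size in bytes
 * (27 + segment_count), or 0 if buf is too short or lacks "OggS". */
size_t ogg_parse_page_header(const uint8_t *buf, size_t len, ogg_page_header *out)
{
    if (len < 27 || memcmp(buf, "OggS", 4) != 0)
        return 0;
    out->version       = buf[4];
    out->header_type   = buf[5];
    out->granule_pos   = read_le(buf + 6, 8);
    out->serial        = (uint32_t)read_le(buf + 14, 4);
    out->sequence      = (uint32_t)read_le(buf + 18, 4);
    out->checksum      = (uint32_t)read_le(buf + 22, 4);
    out->segment_count = buf[26];
    if (len < 27u + out->segment_count)
        return 0;
    /* Each segment table entry is the byte length of one body segment. */
    out->body_len = 0;
    for (size_t i = 0; i < out->segment_count; i++)
        out->body_len += buf[27 + i];
    return 27u + out->segment_count;
}
```

The next page starts at the returned header size plus `body_len`. In an Ogg Vorbis file the tags live in the Vorbis comment header packet, which normally begins on the second page, so the "garbage" before the readable text is mostly page headers plus the codec's identification header.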

暮凉 2024-08-21 08:29:57

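As described in the link you provided, the following metadata blocks can occur between the "fLaC" marker and the VORBIS_COMMENT metadata block.

STREAMINFO: This block has information about the whole stream, like sample rate, number of channels, total number of samples, etc. It must be present as the first metadata block in the stream. Other metadata blocks may follow, and the decoder will skip any that it doesn't understand.
APPLICATION: This block is for use by third-party applications. The only mandatory field is a 32-bit identifier. This ID is granted upon request to an application by the FLAC maintainers. The remainder of the block is defined by the registered application. Visit the registration page if you would like to register an ID for your application with FLAC.
PADDING: This block allows for an arbitrary amount of padding. The contents of a PADDING block have no meaning. This block is useful when it is known that metadata will be edited after encoding; the user can instruct the encoder to reserve a PADDING block of sufficient size so that when metadata is added, it will simply overwrite the padding (which is relatively quick) instead of having to insert it into the right place in the existing file (which would normally require rewriting the entire file).
SEEKTABLE: This is an optional block for storing seek points. It is possible to seek to any given sample in a FLAC stream without a seek table, but the delay can be unpredictable since the bitrate may vary widely within a stream. By adding seek points to a stream, this delay can be significantly reduced. Each seek point takes 18 bytes, so 1% resolution within a stream adds less than 2k. There can be only one SEEKTABLE in a stream, but the table can have any number of seek points. There is also a special 'placeholder' seek point which will be ignored by decoders but which can be used to reserve space for future seek point insertion.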
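
Each of the blocks listed above starts with a 4-byte header: one bit marking the last metadata block, a 7-bit block type, and a 24-bit big-endian body length. That is enough to step over the roughly 350 unreadable bytes without understanding them. The following is a minimal sketch that assumes the start of the file is already in memory; the function name `flac_find_vorbis_comment` is hypothetical, and the type value 4 for VORBIS_COMMENT is taken from the FLAC format description.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FLAC_BLOCK_VORBIS_COMMENT = 4 }; /* block type per the FLAC format page */

/* Walk the metadata blocks that follow the "fLaC" marker.
 * Returns 1 and stores the offset and length of the VORBIS_COMMENT body,
 * 0 if the file has no such block, -1 if the data is malformed or truncated. */
int flac_find_vorbis_comment(const uint8_t *buf, size_t len,
                             size_t *body_off, size_t *body_len)
{
    if (len < 4 || memcmp(buf, "fLaC", 4) != 0)
        return -1;

    size_t pos = 4;
    for (;;) {
        if (len - pos < 4)
            return -1;
        int is_last = buf[pos] >> 7;      /* top bit: last metadata block */
        int type    = buf[pos] & 0x7F;    /* low 7 bits: block type */
        size_t n    = ((size_t)buf[pos + 1] << 16) |
                      ((size_t)buf[pos + 2] << 8)  |
                       (size_t)buf[pos + 3]; /* 24-bit big-endian length */
        pos += 4;
        if (len - pos < n)
            return -1;
        if (type == FLAC_BLOCK_VORBIS_COMMENT) {
            *body_off = pos;
            *body_len = n;
            return 1;
        }
        if (is_last)
            return 0;
        pos += n; /* skip STREAMINFO, PADDING, SEEKTABLE, and so on */
    }
}
```

Skipping by length rather than searching for readable text is what keeps this approach non-hacky: unknown block types are stepped over the same way as known ones.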

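Right after those block descriptions, the same page specifies the format of each block. It also says

All numbers used in a FLAC bitstream are integers; there are no floating-point representations. All numbers are big-endian coded. All numbers are unsigned unless otherwise specified.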
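
One exception to the big-endian rule quoted above matters when reading tags: the body of a VORBIS_COMMENT block follows the Vorbis comment layout, whose 32-bit length fields are little-endian. The body is a vendor string, a comment count, and then that many length-prefixed `NAME=value` strings encoded as UTF-8, so on a UTF-8 Linux system no character conversion should be needed. Below is a minimal sketch of iterating over those strings; `vc_iter`, `vc_begin`, and `vc_next` are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Cursor over the comments in a Vorbis comment body (hypothetical type). */
typedef struct {
    const uint8_t *buf;
    size_t len;
    size_t pos;
    uint32_t remaining; /* comments not yet returned */
} vc_iter;

/* Read a 32-bit little-endian unsigned integer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Skip the vendor string and read the comment count.
 * Returns 0 on success, -1 if the body is truncated. */
int vc_begin(vc_iter *it, const uint8_t *buf, size_t len)
{
    if (len < 4)
        return -1;
    uint32_t vendor_len = read_le32(buf);
    if (len - 4 < vendor_len || len - 4 - vendor_len < 4)
        return -1;
    it->buf = buf;
    it->len = len;
    it->pos = 4 + (size_t)vendor_len;
    it->remaining = read_le32(buf + it->pos);
    it->pos += 4;
    return 0;
}

/* Fetch the next "NAME=value" string (not NUL-terminated).
 * Returns 1 with text and n set, 0 when no comments remain, -1 if malformed. */
int vc_next(vc_iter *it, const uint8_t **text, uint32_t *n)
{
    if (it->remaining == 0)
        return 0;
    if (it->len - it->pos < 4)
        return -1;
    uint32_t field_len = read_le32(it->buf + it->pos);
    it->pos += 4;
    if (it->len - it->pos < field_len)
        return -1;
    *text = it->buf + it->pos;
    *n = field_len;
    it->pos += field_len;
    it->remaining--;
    return 1;
}
```

Because the strings are not NUL-terminated, a caller would print one with something like `printf("%.*s\n", (int)n, (const char *)text)`. The same layout appears inside the Ogg Vorbis comment header packet, where it is preceded by a packet type byte and the string "vorbis".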

So, what are you missing? You say

I'd like to use a non-hacky approach to reading Ogg files.

Why rewrite a library to do that when such libraries already exist?
