Given a WAV file, its file size, and its sample rate, can the sample count be calculated?
Our app needs to know the sample count of the audio files it loads. The library we're using can reliably determine the sample rate, but not the sample count. Is it possible for us to calculate the sample count from just the file size and sample rate?
Comments (3)
What Mark said. No, normally you need to interpret the header. But if the format, the number of channels, and the number of bits per sample are known and identical for all files, you could in theory calculate the sample count from the file size alone.
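As a rough sketch of that file-size shortcut (in Python, which is my choice; the question does not name a language): it assumes every file is uncompressed PCM with a fixed-size header and nothing after the audio data. The function name and the 44-byte default are illustrative assumptions, not something the question's library provides.

```python
def estimate_frame_count(file_size, channels, bits_per_sample, header_size=44):
    """Estimate samples per channel from the file size alone.

    Assumes uncompressed PCM, a header of exactly `header_size` bytes
    (44 is the common minimal RIFF/WAVE header), and no trailing chunks.
    """
    bytes_per_frame = channels * bits_per_sample // 8
    if bytes_per_frame <= 0:
        raise ValueError("channels and bits_per_sample must be positive")
    return (file_size - header_size) // bytes_per_frame
```

Note that the sample rate is not needed for the count itself; it only comes in if you want the duration, which is the frame count divided by the sample rate.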
WAV is a simple format, but unfortunately hardware and software developers big and small have produced many strange variations of it over the years. You can usually count on the format being well-formed if the files come from a modern mainstream wave editor. So if the samples are standardized by exporting them from WaveLab or a similar tool, you can skip writing the (small) header-interpreter code.
The easiest-to-read .wav format description is here. StripWav is a small program for standardizing samples; there is also a more capable command-line tool, sox. Sox supports batch jobs, so it would be a better choice than a wave editor, assuming the set of .wav files is fixed rather than 'dynamic'.
So: if you can standardize them once and for all with a sox batch job, it should be possible. I've used this format description and Sox to great effect several times. Good luck :)
Assuming the WAV file is PCM, you can calculate it from the size of the data chunk. The number of bytes per sample is simply the number of bits per sample divided by eight, and the number of bits per sample is stored in the WAVEFORMAT structure in the header. Dividing the data chunk size by the bytes per sample gives the exact sample count; for a multichannel file, divide again by the number of channels to get the count per channel.
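If Python is an option, its standard-library `wave` module already does this for PCM files: it reads the format fields and, as far as I know, derives the frame count from the data chunk size. The sketch below is only an illustration of the calculation above; the function name is made up, and `wave` rejects most non-PCM files.

```python
import wave

def pcm_sample_counts(path):
    """Return (frames, total_samples) for a PCM WAV file.

    frames        = samples per channel
    total_samples = frames * channels (every individual sample value)
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        channels = wav.getnchannels()
    return frames, frames * channels
```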
In the PCM wav format, the header contains a field called blockalign, which gives the number of bytes a single sample takes (counting all channels together).
Typically you have a standard RIFF PCM wav file with no metadata attached (the usual case). There, blockalign is a 2-byte integer at offset 32 (the 33rd and 34th bytes from the beginning of the wav file), and the size of the audio data, called datasize, is a 4-byte integer at offset 40 (the 41st to 44th bytes from the beginning of the wav file). Both are stored little-endian.
Now datasize/blockalign is what you want.
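A minimal Python sketch of that fixed-offset read, assuming the plain 44-byte header layout described above. The function name is illustrative, and the sanity checks are my addition so that files with a different layout fail loudly instead of returning garbage.

```python
import struct

def frames_from_canonical_header(path):
    """Return datasize // blockalign, read at the fixed offsets 40 and 32."""
    with open(path, "rb") as f:
        header = f.read(44)
    # Only trust the fixed offsets if the header has the expected shape.
    if (len(header) < 44 or header[0:4] != b"RIFF" or header[8:12] != b"WAVE"
            or header[12:16] != b"fmt " or header[36:40] != b"data"):
        raise ValueError("not a canonical 44-byte RIFF/WAVE header")
    (block_align,) = struct.unpack_from("<H", header, 32)  # 2-byte little-endian
    (data_size,) = struct.unpack_from("<I", header, 40)    # 4-byte little-endian
    if block_align == 0:
        raise ValueError("blockalign is zero")
    return data_size // block_align
```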
PS
If you have a more complicated wav file that is still RIFF, the format information and the data are stored in separate "chunks" (along with other chunks you may not need), and the offsets given above may not be correct. In that case you should walk the chunks; for your purposes you need to find the fmt and data chunks.
Each chunk starts with a 4-byte ASCII code called a FOURCC: 'fmt ' marks the chunk holding the format information and 'data' marks the data chunk. Right after the FOURCC comes a 4-byte integer giving the size in bytes of the chunk contents that follow (the FOURCC and these 4 size bytes are not counted).
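The chunk walk can be sketched in Python roughly as follows. The function names are illustrative, and two details are assumptions beyond what is written above: RIFF pads odd-sized chunks with one extra byte, and blockalign sits 12 bytes into the fmt chunk contents (which matches offset 32 in the simple header).

```python
import os
import struct

def find_chunks(path):
    """Map each top-level chunk FOURCC to (content_offset, content_size)."""
    chunks = {}
    with open(path, "rb") as f:
        riff_header = f.read(12)
        if (len(riff_header) < 12 or riff_header[0:4] != b"RIFF"
                or riff_header[8:12] != b"WAVE"):
            raise ValueError("not a RIFF/WAVE file")
        while True:
            head = f.read(8)
            if len(head) < 8:
                break  # end of file
            fourcc, size = struct.unpack("<4sI", head)
            chunks.setdefault(fourcc, (f.tell(), size))
            # Skip the contents; odd sizes are followed by one pad byte.
            f.seek(size + (size & 1), os.SEEK_CUR)
    return chunks

def frame_count(path):
    """Return datasize // blockalign using the fmt and data chunks."""
    chunks = find_chunks(path)
    fmt_offset, _ = chunks[b"fmt "]
    _, data_size = chunks[b"data"]
    with open(path, "rb") as f:
        f.seek(fmt_offset + 12)  # blockalign follows tag, channels, rate, byte rate
        (block_align,) = struct.unpack("<H", f.read(2))
    return data_size // block_align
```

A missing fmt or data chunk surfaces as a `KeyError` here; real code would probably want a clearer error.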
References:
A simple wav header reference HERE
More general RIFF wav format HERE