Java - Downsampling a WAV audio file
Hi, I need to downsample a WAV audio file from 44.1 kHz to 8 kHz. I have to do all the work manually with a byte array; it's for academic purposes.
I am currently using 2 classes, Sink and Source, to pop and push arrays of bytes. Everything goes well until I reach the part where I need to downsample the data chunk using linear interpolation.
Since I'm downsampling from 44100 Hz to 8000 Hz, how do I interpolate a byte array containing something like 128 000 000 bytes? Right now I'm popping 5, 6 or 7 bytes depending on i%2 == 0, i%2 == 1 and i%80 == 0, and pushing the average of these 5, 6 or 7 bytes into the new file.
The result is indeed a smaller audio file than the original, but it cannot be played in Windows Media Player (it reports an error while reading the file), and there is a lot of noise, although I can hear the right track behind the noise.
So, to sum things up, I need help concerning the linear interpolation part. Thanks in advance.
Comments (1)
I don't think you should average those samples: that amounts to a moving-average (mean) filter, not plain downsampling. Just take every 5th/6th/7th sample and write it to the new file.
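Since the question asks for linear interpolation specifically, here is a minimal sketch of what that could look like. It is not taken from the asker's code: the class name `LinearResampler` and its methods are hypothetical, and it assumes the data chunk is mono, signed 16-bit little-endian PCM (the question does not state the format). Under that assumption each sample spans two bytes, so the arithmetic has to be done on decoded samples; averaging raw bytes mixes the high and low halves of different samples, which could explain the noise. The ratio 44100 / 8000 = 5.5125 is why the step alternates between 5 and 6 input samples.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; assumes mono, signed 16-bit little-endian PCM.
final class LinearResampler {

    /** Decodes raw PCM bytes into 16-bit samples (two bytes per sample). */
    static short[] decode(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }

    /** Encodes 16-bit samples back into little-endian PCM bytes. */
    static byte[] encode(short[] samples) {
        ByteBuffer out = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (short s : samples) {
            out.putShort(s);
        }
        return out.array();
    }

    /** Resamples by linear interpolation between the two nearest input samples. */
    static short[] resample(short[] in, int inRate, int outRate) {
        if (in.length == 0) {
            return new short[0];
        }
        int outLen = (int) ((long) in.length * outRate / inRate);
        short[] out = new short[outLen];
        for (int i = 0; i < outLen; i++) {
            // Fractional position of output sample i on the input time axis.
            double pos = (double) i * inRate / outRate;
            int left = (int) pos;
            int right = Math.min(left + 1, in.length - 1);
            double frac = pos - left;
            // Weighted mix of the two neighbours.
            out[i] = (short) Math.round(in[left] * (1.0 - frac) + in[right] * frac);
        }
        return out;
    }
}
```

A call such as `LinearResampler.encode(LinearResampler.resample(LinearResampler.decode(dataChunk), 44100, 8000))` would then produce the new data chunk, where `dataChunk` stands for the bytes popped from the source. For stereo input, each channel would need to be interpolated separately.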
That will probably have some aliasing artifacts but might overall be recognizable.
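The aliasing comes from content above 4 kHz (half of the new 8 kHz rate) folding back into the audible band. A common remedy, which is not part of the answer above, is to low-pass filter the samples before dropping any. The following is only a sketch of one such filter, a windowed-sinc FIR with a Hamming window; the class name `LowPassFir`, the tap count, and the cutoff are illustrative assumptions, and the samples are assumed to be already decoded into a `double[]`.

```java
// Hypothetical helper: windowed-sinc low-pass filter (Hamming window).
final class LowPassFir {

    /** Designs filter taps; 'taps' is assumed to be odd and at least 3. */
    static double[] design(int taps, double cutoffHz, double sampleRateHz) {
        double fc = cutoffHz / sampleRateHz; // normalized cutoff in cycles per sample
        int mid = (taps - 1) / 2;
        double[] h = new double[taps];
        double sum = 0.0;
        for (int n = 0; n < taps; n++) {
            int k = n - mid;
            // Ideal low-pass impulse response (sinc), with the k == 0 limit handled.
            double sinc = (k == 0) ? 2.0 * fc : Math.sin(2.0 * Math.PI * fc * k) / (Math.PI * k);
            // Hamming window to reduce ripple caused by truncating the sinc.
            double window = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (taps - 1));
            h[n] = sinc * window;
            sum += h[n];
        }
        for (int n = 0; n < taps; n++) {
            h[n] /= sum; // normalize so that constant (DC) input passes unchanged
        }
        return h;
    }

    /** Convolves x with h, centered so the output is not delayed; edges see zeros. */
    static double[] filter(double[] x, double[] h) {
        int mid = (h.length - 1) / 2;
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            double acc = 0.0;
            for (int n = 0; n < h.length; n++) {
                int j = i + mid - n;
                if (j >= 0 && j < x.length) {
                    acc += h[n] * x[j];
                }
            }
            y[i] = acc;
        }
        return y;
    }
}
```

For example, `LowPassFir.filter(samples, LowPassFir.design(101, 3600.0, 44100.0))` would attenuate most content above roughly 4 kHz before the samples are decimated. The cutoff sits slightly below 4 kHz to leave room for the filter's transition band; both numbers are starting points to tune, not recommended values.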
Another solution, more complex but probably better quality-wise, would be to first transform your samples into the frequency domain using an FFT or DFT and then transform them back at the appropriate sample rate. It's been a while since I have done such a thing, but it's definitely doable. You may need to fiddle around a bit to get it working properly, though.
Also, if you take the FT of segments rather than of the complete array, you run into the problem of the segment boundaries being 0. A few years ago, when I played with those things, I didn't come up with a viable solution to this (since it generates artifacts as well), but there probably is one if you read the right books :-)
As for WMP complaining about the file: You did modify the header you write accordingly, right?
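To illustrate the header point: in a canonical 44-byte PCM WAV header, the sample rate, byte rate, RIFF chunk size, and data chunk size all depend on the resampled data, so copying the original header unchanged leaves the file inconsistent. The sketch below patches those four fields. The class name `WavHeaderPatcher` is hypothetical, and the code assumes the file really uses the plain 44-byte layout with no extra chunks before `data`, which real files do not always follow.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper; assumes a canonical 44-byte PCM WAV header.
final class WavHeaderPatcher {

    /** Returns a copy of the header updated for a new sample rate and data size. */
    static byte[] patch(byte[] header, int newSampleRate, int newDataSize) {
        if (header.length < 44) {
            throw new IllegalArgumentException("expected at least 44 header bytes");
        }
        byte[] out = Arrays.copyOf(header, 44);
        ByteBuffer buf = ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN);

        // Block align (bytes per sample frame) does not change when resampling.
        int blockAlign = buf.getShort(32) & 0xFFFF;

        buf.putInt(4, 36 + newDataSize);            // RIFF chunk size
        buf.putInt(24, newSampleRate);              // sample rate
        buf.putInt(28, newSampleRate * blockAlign); // byte rate
        buf.putInt(40, newDataSize);                // data chunk size
        return out;
    }
}
```

Here `newDataSize` is the length in bytes of the downsampled data chunk, so a call might look like `WavHeaderPatcher.patch(originalHeader, 8000, newData.length)`, with `originalHeader` and `newData` standing for whatever the Source and Sink classes provide.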