Detecting the sample data that belongs to a specific part of a sound file
I want to extract the sample byte data that belongs to a certain region of a sound clip, for example a single word, so that I end up with a collection of samples related only to that word, which I can then send through an FFT. How can I identify this collection of data within the bytes of the whole sound file? Some of the data from the file looks like this after converting the bytes to 2-byte values, because it's a 16-bit sound file (44100 Hz, 15 sec).
49150.0
43010.0
15622.0
58886.0
19460.0
35583.0
0.0
7930.0
507.0
2303.0
59897.0
39419.0
517.0
6663.0
9989.0
13055.0
9210.0
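A note on the values listed above: they all fall in the range 0..65535, which suggests the byte pairs were read as unsigned integers. In a standard PCM WAV file, 16-bit samples are signed and little-endian, so quiet passages sit near 0 and appear as values near 0 or near 65535 when read unsigned. The sketch below assumes that layout (the file in question may differ), and the helper names `decode_pcm16` and `unsigned_to_signed16` are hypothetical, introduced only for illustration.

```python
import struct

def decode_pcm16(raw: bytes, little_endian: bool = True) -> list:
    """Interpret raw bytes as signed 16-bit samples (-32768..32767).

    Assumes plain PCM data with no header bytes mixed in.
    """
    count = len(raw) // 2  # ignore a trailing odd byte, if any
    fmt = ("<" if little_endian else ">") + f"{count}h"
    return list(struct.unpack(fmt, raw[:count * 2]))

def unsigned_to_signed16(value: int) -> int:
    """Reinterpret an unsigned 16-bit value (0..65535) as two's complement."""
    return value - 65536 if value >= 32768 else value

# Re-reading a few of the listed values as signed samples:
for v in (49150, 59897, 507, 0):
    print(v, "->", unsigned_to_signed16(v))
```

Read this way, small magnitudes correspond to quiet audio, which is usually what makes silence visible in the time domain.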
I am aware that this data is in the time domain, and I am not seeing any significant changes in the data, such as a run of 0s, that would identify silence. Can I do this in the time domain, or do I have to take the data to the frequency domain, filter out the unnecessary data, and do an inverse FFT to get a collection of data that makes sense? Thanks in advance.
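One common time-domain approach to the question above is to compare the short-time energy of the signal against a threshold: frames whose RMS level stays above the threshold are treated as sound, the rest as silence. The following is only a minimal sketch. It assumes the samples are already signed 16-bit values from a single channel, and the names `frame_rms` and `find_active_regions`, the 441-sample frame (10 ms at 44100 Hz) and the threshold of 500 are illustrative assumptions that would need tuning on real recordings.

```python
import math

def frame_rms(samples, frame_len):
    """RMS level of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def find_active_regions(samples, frame_len=441, threshold=500.0):
    """Return (start, end) sample-index pairs where the frame RMS exceeds threshold."""
    regions = []
    start = None
    for k, rms in enumerate(frame_rms(samples, frame_len)):
        if rms > threshold and start is None:
            start = k * frame_len            # a loud region begins here
        elif rms <= threshold and start is not None:
            regions.append((start, k * frame_len))  # region ended at this frame
            start = None
    if start is not None:
        # The signal was still loud when the last complete frame ended.
        regions.append((start, (len(samples) // frame_len) * frame_len))
    return regions

# Synthetic example: silence, a loud burst, then silence again.
signal = [0] * 882 + [10000, -10000] * 441 + [0] * 882
print(find_active_regions(signal))
```

A fixed threshold tends to break down with background noise or quiet speech, so this locates loud regions rather than specific words.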
Comments (1)
One way to do this, perhaps the easiest, is to load the sound file into an audio editing application that lets you set the start and end points of a selection, and just listen and move the selection points until you hear what you want. Trying to find an accurate and robust description of those end points that is usable by a software algorithm is a much more difficult problem.
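Once the start and end times of the selection have been read off in the editor, turning them into sample indices is simple arithmetic: index = time in seconds × sample rate. A minimal sketch follows, assuming the samples are already decoded into a list and that multi-channel audio is interleaved; the function name `slice_by_time` and its parameters are hypothetical.

```python
def slice_by_time(samples, start_sec, end_sec, sample_rate=44100, channels=1):
    """Return the samples between two time points.

    For multi-channel audio the list is assumed to be interleaved,
    so each point in time occupies `channels` consecutive entries.
    """
    if not 0 <= start_sec <= end_sec:
        raise ValueError("expected 0 <= start_sec <= end_sec")
    start = int(round(start_sec * sample_rate)) * channels
    end = int(round(end_sec * sample_rate)) * channels
    return samples[start:end]

# Example: take the half second between 0.5 s and 1.0 s of a 2-second mono clip.
clip = list(range(44100 * 2))
word = slice_by_time(clip, 0.5, 1.0)
print(len(word))
```

The resulting slice is the collection of samples that can then be passed to an FFT routine.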