Audio comparison against a model

Posted 2024-11-03 02:12:51

I want to solve the following problem in Java, as it is the language I am most experienced in and my preferred choice.

I want to build a model of a sound, such as a dog barking, based on, say, 100 sound samples of different dogs barking. Once I have this model, I want to record a clip from a microphone and compare it against the model to estimate the probability that the recording matches closely enough to conclude that the recorded sound was a dog.
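For the recording step, a minimal sketch using the standard Java Sound API (`javax.sound.sampled`) might look like the following. The class name `MicCapture`, the helper `toSamples`, and the 16 kHz, 16-bit mono format are assumptions chosen for illustration, not something specified in the question.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

class MicCapture {
    // Assumed format: 16 kHz, 16-bit, mono, signed, little-endian PCM.
    static final AudioFormat FORMAT = new AudioFormat(16000f, 16, 1, true, false);

    /** Records roughly the given number of seconds from the default microphone as raw PCM bytes. */
    static byte[] record(double seconds) throws LineUnavailableException {
        int totalBytes = (int) (FORMAT.getSampleRate() * seconds) * FORMAT.getFrameSize();
        byte[] pcm = new byte[totalBytes];
        TargetDataLine line = AudioSystem.getTargetDataLine(FORMAT);
        line.open(FORMAT);
        line.start();
        try {
            int offset = 0;
            while (offset < totalBytes) {
                int read = line.read(pcm, offset, totalBytes - offset);
                if (read <= 0) {
                    break; // line was stopped or closed early
                }
                offset += read;
            }
        } finally {
            line.stop();
            line.close();
        }
        return pcm;
    }

    /** Converts 16-bit little-endian signed PCM bytes to samples in [-1, 1). */
    static double[] toSamples(byte[] pcm) {
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte, keeps the sign
            samples[i] = ((hi << 8) | lo) / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) throws LineUnavailableException {
        double[] clip = toSamples(record(2.0));
        System.out.println("Captured " + clip.length + " samples");
    }
}
```

The resulting `double[]` is the kind of input a Fourier transform routine would typically expect; whether 16 kHz is a sensible rate for dog barks is left open here.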

I had the following in mind:

Compute the Fourier transform (FT) of each of the 100 dog-bark samples.

Average the 100 FTs - this average is the model.

Record the sound clip and compute its Fourier transform.

Subtract the sound clip's FT from the model FT to see how they compare?
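The four steps above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not a recommended classifier: it uses a naive O(n²) DFT to stay dependency-free (a real project would more likely use an FFT library), it assumes every clip has the same length, and it compares magnitude spectra with cosine similarity instead of subtracting raw transforms, since raw complex values also depend on timing and loudness. The class `SpectrumModel` and all of its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SpectrumModel {

    /** Magnitude spectrum (bins 0..n/2) of a real signal via a naive DFT. */
    static double[] magnitudeSpectrum(double[] signal) {
        int n = signal.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += signal[t] * Math.cos(angle);
                im -= signal[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Element-wise mean of equally sized spectra: the "model" in the steps above. */
    static double[] averageSpectrum(List<double[]> spectra) {
        double[] mean = new double[spectra.get(0).length];
        for (double[] s : spectra) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += s[i] / spectra.size();
            }
        }
        return mean;
    }

    /** Cosine similarity of two spectra; for non-negative inputs it lies in [0, 1]. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /** Synthetic test tone: `cycles` full periods over n samples (stand-in for real audio). */
    static double[] sine(int cycles, int n, double amplitude) {
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            x[t] = amplitude * Math.sin(2 * Math.PI * cycles * t / n);
        }
        return x;
    }

    public static void main(String[] args) {
        // Stand-ins for the training clips: the same tone at different loudness.
        List<double[]> training = new ArrayList<>();
        for (double amp : new double[] {0.5, 1.0, 1.5}) {
            training.add(magnitudeSpectrum(sine(5, 64, amp)));
        }
        double[] model = averageSpectrum(training);

        double similar = cosineSimilarity(model, magnitudeSpectrum(sine(5, 64, 0.8)));
        double different = cosineSimilarity(model, magnitudeSpectrum(sine(20, 64, 0.8)));
        System.out.println("same tone:      " + similar);
        System.out.println("different tone: " + different);
    }
}
```

A similarity score like this is not a probability; turning it into a dog / not-dog decision would need a threshold chosen from labelled examples, and real barks would probably need to be split into short windowed frames first.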

I am not hugely experienced with audio, so it would be great if anyone could tell me whether this is the correct approach, which FFT library to use, and how to build an average FT from 100 samples.

Thanks

Comments (1)

寂寞陪衬 2024-11-10 02:12:51

Even though I've read about FTs several times, I've never actually used them myself.

However, I have used the CoMIRVA library (www.cp.jku.at/comirva). It implements techniques for comparing music that are based on, among other things, FTs. In short, it compares two "audio sources" against each other by comparing their timbre (http://en.wikipedia.org/wiki/Timbre). When I used it, it worked well in some cases and not so well in others. That was with music, though; I have no idea whether it will work with dog barking.

I suggest you have a look at it and read more about the techniques it implements. You'll find more details under the heading Audio Processing. I recommend reading both reports (Mandel and Ellis; Aucouturier and Pachet).

Good luck!
