How to convert human voice into digital format?

Posted on 2024-10-19 13:16:00

I am working on a project where a biometric system is used to secure the system. We are planning to use the human voice to secure it.

The idea is to have a person say some words or sentences, and the system will store that voice in digital format. The next time the person wants to enter the system, he/she has to speak some words, which may or may not be the same as the words used earlier.

We don't want to match the words; we want to match the voice frequency.

I have read some research papers on this kind of system, but those papers don't give any implementation details.

So I just want to know whether there is any software/API that can convert an analog voice into digital format and also tell us the frequency of the voice.

Until now I have been working on normal web-based applications, so I know normal APIs and platforms like Java EE, C#, etc., but I don't have any experience with this kind of application.

Please enlighten me!


Comments (3)

以往的大感动 2024-10-26 13:16:01

This is as good a starting point as any: http://marsyas.info/

It's an open source software framework for audio processing. They've listed a bunch of projects that have used their framework in various ways, so you could probably draw inspiration from them:
http://marsyas.info/about/projects. The Telligence project in particular seems the closest to your needs, as it was used to classify audio by gender: http://marsyas.info/about/projects#5Teligence

橙味迷妹 2024-10-26 13:16:01

I believe there are two steps in a project like this one:

The first step would be to record the voice from an analog input into a digital format (let's assume WAV PCM). For this you can use the DirectShow API in C#, or standard Wav-In as in this project: http://www.codeproject.com/KB/audio-video/cswavrec.aspx. You may consider compressing your audio files later on; there are many options for this, and on Windows you may consider the Windows Media Format SDK to avoid licensing issues with other formats.
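To make the "digital format" concrete, here is a minimal sketch in Python (not the C# APIs mentioned above) of what a mono 16-bit PCM WAV looks like once it is read back as numbers. It uses only the standard library. The helper names `write_tone_wav` and `read_pcm16_mono` are made up for illustration, and the synthetic sine tone merely stands in for a real microphone recording.

```python
import io
import math
import struct
import wave


def write_tone_wav(fileobj, freq_hz=220.0, rate=16000, seconds=0.5):
    """Write a mono 16-bit PCM sine tone (a stand-in for a real recording)."""
    n = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(1)    # mono
        w.setsampwidth(2)    # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(frames)


def read_pcm16_mono(fileobj):
    """Return (sample_rate, samples) with samples scaled to [-1.0, 1.0)."""
    with wave.open(fileobj, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, [s / 32768.0 for s in ints]


buf = io.BytesIO()          # an in-memory file; a path on disk works as well
write_tone_wav(buf)
buf.seek(0)
rate, samples = read_pcm16_mono(buf)
print(rate, len(samples))   # sample rate and number of samples read
```

Once the recording is available as a list of samples plus a sample rate, any later analysis step can work on plain numbers regardless of which API did the capture.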

The second step is to build or use a voice recognition framework. If you want to build one, you will probably need to define a set of "features" for your sound fragments and select and implement a recognition algorithm. There are many approaches available for this; the IEEE and ACM.org websites are usually good sources. If you want to use an existing framework, you may want to consider Nuance Recognizer (commercial) or http://cmusphinx.sourceforge.net (open source).
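As an illustration of what one such "feature" might look like, here is a rough Python sketch that estimates the fundamental frequency (pitch) of a single frame by naive autocorrelation. The function name `estimate_pitch_hz` and the 60 to 400 Hz search range are assumptions chosen for the example, not part of any framework named above. Pitch on its own is generally not distinctive enough to tell speakers apart, so treat this as a starting point rather than a recognition algorithm.

```python
import math


def estimate_pitch_hz(samples, rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Tries every lag between rate/fmax and rate/fmin and keeps the lag at
    which the frame correlates best with a shifted copy of itself.
    Returns 0.0 if no lag gives a positive correlation (e.g. silence).
    """
    min_lag = int(rate / fmax)
    max_lag = int(rate / fmin)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return rate / best_lag if best_lag else 0.0


# Synthetic 200 Hz tone standing in for one frame of voiced speech.
rate = 16000
frame = [0.5 * math.sin(2 * math.pi * 200.0 * i / rate) for i in range(2048)]
pitch = estimate_pitch_hz(frame, rate)
silence = estimate_pitch_hz([0.0] * 2048, rate)
print(round(pitch, 1), silence)
```

On real speech you would run this over many short frames and skip unvoiced ones. A real system would likely combine several features per frame before handing them to whatever recognition algorithm you pick.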

Hope this helps.
