I want to get started with speech recognition and speech synthesis
for a prototype based on speech recognition.
Somebody told me to use Microsoft Speech Server (SDK and so on).
Once I have this, how do I program an application, and with which programming language (and development environment)?
Does anyone have experience with Asterisk or SVOX?
I need to do:
speech recognition
speech synthesis
It doesn't have to be very good speech recognition; I think 30-50 words should be enough to begin with.
I'm working with Windows.
Thanks in advance.
Comments (2)
If you choose to use the Microsoft speech engine, there are .NET Framework APIs for it. As I mentioned in the other post, there are two namespaces (System.Speech for desktop use and Microsoft.Speech for server use). You can program in any .NET language, and you can use Visual Studio.
There is a very good article that was published a few years ago at http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. It is probably the best introductory article I've found so far. However, it was based on a prerelease version of the WinFX API, and the System.Speech classes were changed when Vista was released. The samples in the article do not compile because of these breaking API changes, and I have not found any updates or errata to explain this. If you search the Internet for the method name "AppendResultKeyValue", you'll find a few forum posts like http://www.ms-news.net/f3012/system-speech-breaking-changes-3025734.html where people ran into this same problem.
It is still a good introductory article and well worth reading. With a little bit of hacking, you can get the sample app working.
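Since the question only needs 30-50 words, a fixed-word grammar is one way to keep the recognizer constrained. Here is a minimal sketch that generates such a grammar in the W3C SRGS XML format, which System.Speech can load as a grammar file. It is only an illustration under assumptions: Python is used purely to write the file (the recognizer itself would still be driven from .NET), and the function name `build_word_grammar`, the rule id `command`, and the sample words are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SRGS 1.0 specification.
SRGS_NS = "http://www.w3.org/2001/06/grammar"


def build_word_grammar(words, rule_id="command", lang="en-US"):
    """Return an SRGS XML string whose root rule matches exactly one of `words`.

    `rule_id` and `lang` are illustrative defaults, not values from the post.
    """
    grammar = ET.Element("grammar", {
        "version": "1.0",
        "xml:lang": lang,
        "root": rule_id,   # name of the rule the recognizer starts from
        "xmlns": SRGS_NS,
    })
    rule = ET.SubElement(grammar, "rule", {"id": rule_id, "scope": "public"})
    one_of = ET.SubElement(rule, "one-of")  # alternatives: any single word
    for word in words:
        ET.SubElement(one_of, "item").text = word
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical vocabulary; replace with the 30-50 words the prototype needs.
    print(build_word_grammar(["start", "stop", "left", "right"]))
```

The output can be saved to a file and loaded by the recognizer as a grammar. Restricting recognition to a short list like this generally gives better accuracy than open dictation for a small command set.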
When I studied computational linguistics, the tool of choice was Praat, a horribly confused prototyping tool that lets you do just about anything speech-related.
I don't think it has any external API, but its internal scripting language is sufficient for rudimentary applications, and it has lots of built-in functions. For "getting started" on theory and algorithms, it is not too bad.