Voice recognition engine for embedded applications
I am researching the voice recognition engines and SDKs available for developing a voice-enabled Windows CE application. I've run across Nuance, but I haven't found much else. I would prefer a .NET SDK if possible, though I imagine most are C/C++. I'd appreciate any suggestions. Thanks.
Comments (5)
Nuance has basically bought everyone up. They rule the speech market, I'm afraid...
There are a few other companies that deal in the technology, but I don't know how well they do in the embedded market. There are Telisma and Loquendo, both of which have a strong non-English presence (and their English isn't too bad either).
Then there is still IBM, which has ViaVoice Embedded.
One of the big things the industry is waiting for is to see what comes out of Microsoft's acquisition of TellMe, but I think they might stay away from the embedded market and instead push the processing to the "cloud", which is where TellMe has been for a long time.
I work with IVR applications; in addition to Nuance, we're currently evaluating Microsoft, IBM, and Lumenvox.
The voice recognition applications included on most cell phones are designed to match voice input against a previously recorded phrase: you assign the spoken phrase "Joe" to an address book entry, and the phone dials that entry when you say "Joe" again. The more powerful speech recognition engines try to decipher freeform speech by breaking a phrase down into phonemes and then matching those against an acoustic repository to figure out what was actually said. A full-blown speech recognition engine requires a fair amount of CPU horsepower; to do anything complex with voice recognition on a mobile device, you'll probably need to send data from the device to a server for processing.
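To make the first approach concrete, here is a minimal sketch of matching an utterance against previously recorded phrases. It uses dynamic time warping (DTW), a classic technique for this kind of template matching; I don't know what any particular phone actually uses, so treat that choice as an assumption. The function names (`dtw_distance`, `recognize`) and the toy one-dimensional "features" are made up for illustration. A real system would extract per-frame acoustic features (for example MFCCs) from the audio first.

```python
import math


def frame_distance(x, y):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature frames.

    DTW lets the two sequences stretch or compress in time, so the same
    word spoken faster or slower can still line up with its template.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance.

    `templates` maps a label (e.g. an address book name) to the feature
    sequence recorded when the user trained that entry.
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

With templates such as `{"Joe": [(1.0,), (3.0,), (2.0,)], "Ann": [(5.0,), (5.0,), (1.0,)]}`, a slightly slower rendition like `[(1.0,), (1.1,), (3.2,), (2.1,)]` comes back as `"Joe"`. This is cheap enough for a small device but only works for phrases the user has recorded beforehand, which is why freeform recognition needs the heavier phoneme-based engines.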
Try looking into Microsoft's Speech API: http://msdn.microsoft.com/en-us/library/ms897381.aspx
I believe it runs on CE devices.
There is also the open source project CMU Sphinx. It has a variant called PocketSphinx that targets portable devices.
As stated in one of my comments above, we are trying a voice recognition .NET SDK from Vangard Voice Systems. It uses Nuance's Vocon3200 voice recognition engine, which is well respected and seems to work well in early testing. We're using a cheap microphone right now and have some issues with outside noise; hopefully noise-cancelling headsets will resolve that. The software model is a bit lacking in that it basically hooks into an existing non-voice application. That imposes some limitations, and the API accessible to the developer is limited. Any time you try to oversimplify something like this, you make crafting a powerful solution much more difficult. That said, we really couldn't find any competing product that meets our need for a .NET SDK for voice-enabling mobile applications. They currently have a nice little niche carved out.
I would have preferred to go with Nuance's C++ SDK (for which another company has written .NET wrappers), but the Nuance business model assumes we're developing a product for resale and involves some significant royalties. That is a real barrier for a company that wants to develop internal applications.