Can an iPhone recognize large-vocabulary speech without a network connection?

Posted on 2024-11-25 06:26:57

I used OpenEars, which needs a dictionary. It works well for words that are listed in the dictionary, but I want to transcribe every word we speak. So I tried Nuance's Dragon speech recognition SDK, but it communicates with a web server. I want to avoid server communication because of security concerns. Is it possible to convert speech to text for all spoken words, as on Windows Mobile, entirely offline and without contacting a server?

Comments (3)

尘世孤行 2024-12-02 06:26:57

Speech recognition with an unlimited vocabulary requires very large computational and memory resources (gigabytes of memory), so it is very hard to do on an iPhone or any other embedded device. An iPhone is 9 times slower than a desktop. An iPad is easier since it has a more powerful CPU.

Google has put a great deal of effort into making their engine work offline for dictation, and it still prefers to send data to the server because that is significantly more accurate.

Because of that, most solutions running on small devices use a limited vocabulary. This vocabulary can still be large enough that you will not notice the limit. Usually 500-1000 words are enough to cover most practical situations. You can use OpenEars to recognize such a vocabulary.
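
One way to check whether a limited vocabulary is enough for your own application is to measure how much of your domain text the most frequent words cover. The following is only a minimal sketch in Python, not part of OpenEars or CMUSphinx; the function names and the sample sentence are hypothetical and stand in for text you would collect yourself.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(tokens, vocab_size):
    """Fraction of tokens covered by the vocab_size most frequent words."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(n for _, n in counts.most_common(vocab_size))
    return covered / len(tokens)


# Hypothetical sample; replace it with text collected from your own domain.
sample = "call home call the office send a message to home open the map"
tokens = tokenize(sample)
for size in (2, 5, 1000):
    print(size, round(coverage(tokens, size), 2))
```

If the coverage at 500-1000 words is already close to 1.0 on a representative amount of your own text, a limited-vocabulary recognizer is likely to be sufficient.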

To train a language model, you need texts from your domain (words and expressions). Language model training is described in the CMUSphinx tutorial. To use the language model, you can call the following OpenEars API method:

- (void)changeLanguageModelToFile:(NSString *)languageModelPathAsString
                   withDictionary:(NSString *)dictionaryPathAsString;

See the API reference for more details.

You can use OpenEars with such a vocabulary and the corresponding language model to support free-form text entry on your device.
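
As a rough illustration of preparing domain text for language model training, the Python sketch below normalizes a few phrases into one sentence per line and collects the word list. It assumes the `<s>` / `</s>` sentence-marker convention used by the CMUSphinx language modeling tools; check the CMUSphinx tutorial for the exact format your tool expects. The helper names and the phrases are hypothetical.

```python
import re


def normalize(sentence):
    """Lowercase a sentence and keep only letters, apostrophes and spaces."""
    cleaned = re.sub(r"[^a-z' ]+", " ", sentence.lower())
    return " ".join(cleaned.split())


def build_corpus(sentences):
    """Return (corpus_lines, vocabulary) for a list of raw sentences."""
    lines = []
    vocab = set()
    for raw in sentences:
        words = normalize(raw).split()
        if not words:
            continue  # skip sentences that contain no usable words
        vocab.update(words)
        # Assumed convention: wrap each sentence in <s> ... </s> markers.
        lines.append("<s> " + " ".join(words) + " </s>")
    return lines, sorted(vocab)


# Hypothetical domain phrases; replace them with text from your application.
phrases = ["Call home.", "Send a message to Anna!", "123"]
corpus_lines, vocabulary = build_corpus(phrases)
print("\n".join(corpus_lines))
print(vocabulary)
```

The corpus lines would feed the language model training step, and the vocabulary is the list of words for which the pronunciation dictionary needs entries.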

就此别过 2024-12-02 06:26:57

It could be done, but if you are looking for an unlimited-vocabulary speech-to-text converter, it is best to do the computation on a server. The requirements for such a system are probably too great for a device such as a smartphone. The main areas with heavy resource requirements are:

  1. The dictionary that maps input speech to text.
  2. The computation needed to run the speech recognition algorithms.

I believe this is why companies like Google run their speech recognition services on a server and not on the phone.

But if the application is limited-vocabulary speech to text, then it might be worth giving it a try.

All the best!

彼岸花ソ最美的依靠 2024-12-02 06:26:57

Doesn't pocketsphinx work on the iPhone without network connectivity? Aren't there some demo apps floating around, like VocalKit?

http://www.rajeevan.co.uk/pocketsphinx_in_iphone/ may be helpful.
