Speaker-dependent speech recognition engine with SDK
I want to build a small application. Does anyone know of a good speaker-dependent speech recognition engine that comes with an SDK? (Not a speech-to-text engine.)
Thank you,
Efrat
Comments (4)
Sphinx is probably along the lines of what you're looking for. It's an open-source speech recognition platform and an ongoing project at Carnegie Mellon University.
I used sphinx-4 and reached an accuracy of 82.25%. I am trying to work out how to raise it above 95%. I am transcribing only one person's voice, so a speaker-dependent system would be a great help. The vocabulary is around 40,000 words. I have a dual-core system and can easily run sphinx-train and the sphinx4 decoder, although the trainer takes a day to train on the 40 hours of audio I have; the decoder runs in real time.
I want to know if there is a product or open-source library that I can use to increase my accuracy.
Thanks,
Dharani
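The post does not say how the 82.25% figure was measured. Assuming it is word accuracy, a common convention is 1 minus the word error rate (WER), where WER is the word-level edit distance between the reference transcript and the decoder output, divided by the number of reference words. The sketch below is a minimal illustration of that convention; the function name `word_error_rate` is hypothetical and is not part of Sphinx.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization and exact string matching; real
    scoring tools usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                       # reference word deleted
                cur[j - 1] + 1,                    # extra word inserted
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


wer = word_error_rate("turn the lights on", "turn lights off")
print(f"WER = {wer:.2f}, word accuracy = {1 - wer:.2f}")
```

Scoring the same held-out recordings before and after any change (more training audio, speaker adaptation, a different language model) gives a like-for-like comparison against the 82.25% baseline.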
There are speaker-dependent engines. They are more primitive, like the ones a lot of cellphones have. They do not attempt to convert speech to text; they just do signal comparison, and that is what I need.
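The thread does not name the comparison method. One classic way to do this kind of signal comparison is template matching with dynamic time warping (DTW): the user records each command once, and a new utterance is assigned to whichever stored template it aligns with at the lowest cost. The sketch below assumes each utterance has already been turned into a sequence of per-frame feature vectors (for example MFCCs); the toy one-dimensional features and the names `dtw_distance` and `recognize` are illustrative only and do not come from any particular engine.

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW alignment cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(a[i - 1], b[j - 1])  # Euclidean distance
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # stretch a frame of b over several frames of a
                cost[i][j - 1],      # stretch a frame of a over several frames of b
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy data: one recorded template per command, 1-D features per frame.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "no": [(2.0,), (2.0,), (0.0,), (0.0,)],
}
# The same "yes" contour spoken more slowly (some frames repeated).
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,), (1.0,)]
print(recognize(utterance, templates))  # prints "yes"
```

Because the templates come from a single user's recordings, a matcher like this is inherently speaker dependent, and it never produces text, only the label of the nearest template. A practical version would also reject utterances whose best cost exceeds a threshold.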
Please elaborate. What platform? What size vocabulary? What performance constraints? Continuous? Semi-continuous? What do you mean by "not speech to text engine"?
If you need something simple and small, you might want to try EARS. It is written in C and not very big, so it is probably good for beginners.