Need advice on speech recognition?

Posted on 2024-11-18 11:15:57

I developed an application which converts from voice to text using SAPI 5.1.

Because the accuracy was too poor, I decided to create my own grammar, which only recognizes the numbers from one to ten.

The accuracy was still poor, so I looked more closely at the grammar file and came across the lexicon file, which is used for pronunciation. So my questions are:

  1. Will a lexicon file improve the accuracy? If so, I could put the pronunciations of the numbers one to ten in the lexicon file and then use it.

  2. I need a template showing how to create a lexicon file.


Comments (1)

绝不服输 2024-11-25 11:15:57

If your speech recognition accuracy is weak, it could be any one of the following reasons:

  1. Not enough training data. Note that creating a speaker-dependent speech recognition system (one tied to a single speaker) requires a large number of units (recorded utterances) of each word (one to ten in your case). Individual units are required to train the initial models, and embedded training data may then be required to improve the models further.

  2. A speaker-independent speech recognition model will require even more data.

  3. There is a mismatch between the testing and training data. If the models were created using noise-less data or on data with an accent, it may be difficult to get good results when testing with data that has a lot of noise or has a different accent.
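To check whether the third cause applies, it can help to measure accuracy separately for each recording condition. The following is only a minimal sketch: the function name `accuracy_by_condition`, the condition labels, and the sample log are all made up for illustration, and the recognized words would in practice come from whatever your recognizer returns.

```python
from collections import defaultdict


def accuracy_by_condition(results):
    """Return {condition: fraction of correctly recognized words}.

    `results` is an iterable of (condition, expected_word, recognized_word)
    tuples. This layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for condition, expected, recognized in results:
        totals[condition] += 1
        # Compare case-insensitively and ignore surrounding whitespace.
        if expected.strip().lower() == recognized.strip().lower():
            correct[condition] += 1
    return {c: correct[c] / totals[c] for c in totals}


# Hypothetical hand-written log entries, not real measurements.
sample_log = [
    ("quiet", "one", "one"),
    ("quiet", "two", "two"),
    ("quiet", "nine", "nine"),
    ("quiet", "five", "nine"),
    ("noisy", "one", "one"),
    ("noisy", "two", "ten"),
    ("noisy", "five", "nine"),
    ("noisy", "eight", "eight"),
]

scores = accuracy_by_condition(sample_log)
for condition, score in scores.items():
    print(f"{condition}: {score:.0%}")
```

A large gap between conditions would point to a train/test mismatch rather than to the grammar itself.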

More details about the speech recognition system you are trying to build would help narrow this down.

Update 1: Since you mention in the comments that you are using the Microsoft Speech SDK, here is a guide to training the Speech SDK on sounds/accents. Just follow the instructions and that should set you on your way.
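For the restricted one-to-ten grammar described in the question, a rough sketch of generating the grammar file is shown below. It assumes the SAPI 5 XML text grammar format with `GRAMMAR`, `RULE`, `L` (list) and `P` (phrase) elements, as I recall it from the SDK; the rule name `number`, the property name, and the helper `build_number_grammar` are hypothetical, so verify the element and attribute names against the SDK documentation before relying on them.

```python
import xml.etree.ElementTree as ET

NUMBER_WORDS = ["one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]


def build_number_grammar(words=NUMBER_WORDS, langid="409"):
    """Build an XML grammar string with one top-level rule listing `words`.

    LANGID "409" is assumed to mean US English; the rule and property
    names are placeholders chosen for this sketch.
    """
    grammar = ET.Element("GRAMMAR", {"LANGID": langid})
    rule = ET.SubElement(grammar, "RULE",
                         {"NAME": "number", "TOPLEVEL": "ACTIVE"})
    # <L> holds alternatives; exactly one <P> phrase is expected to match.
    choices = ET.SubElement(rule, "L", {"PROPNAME": "number"})
    for value, word in enumerate(words, start=1):
        phrase = ET.SubElement(choices, "P", {"VAL": str(value)})
        phrase.text = word
    return ET.tostring(grammar, encoding="unicode")


grammar_xml = build_number_grammar()
print(grammar_xml)
```

Keeping the grammar this small limits what the recognizer can confuse, but it does not replace the training step described above.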
