Speech recognition with Julius: how to make a .voca file?

Posted on 2024-09-14 01:39:03

I'm building a speech recognition system, and Julius gives fairly good results for it.
Words from the sample .voca file are recognized perfectly, but how do I add my own words and their transcriptions to the file?

I've tried the latest release and the nightly builds of the VoxForge (http://www.voxforge.org/) acoustic models together with their vocabulary, but I get a lot of errors like these when Julius starts:

Error: voca_load_htkdict: line 19: triphone "r-d+v" not found
Error: voca_load_htkdict: line 19: triphone "d-v+aa" not found
Error: voca_load_htkdict: the line content was: 2   [AARDVARK]  aa r d v aa r k
Error: voca_load_htkdict: begin missing phones
Error: voca_load_htkdict: r-d+v
Error: voca_load_htkdict: d-v+aa
Error: voca_load_htkdict: end missing phones
Error: init_voca: error in reading /usr/src/custom/julius/quickstart/grammar/sample.dict
ERROR: failed to read dictionary "/usr/src/custom/julius/quickstart/grammar/sample.dict"
ERROR: m_fusion: some error occured in reading grammars
ERROR: Error in loading model

Does anyone know the rules for transcribing words in .voca files?

Comments (1)

◇流星雨 2024-09-21 01:39:04

Cause of the error:
Julius outputs these messages when your word dictionary contains words whose triphones were not trained in the acoustic model (AM). The loader in "voca_load_htkdict.c" tries to match the triphones generated from the dict file against the triphone list of the acoustic model; when it cannot find one, it reports this error and stops the program.
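To make the matching step concrete, here is a minimal Python sketch of how a dictionary entry expands into context-dependent names. It assumes the usual HTK naming convention (left-center+right, with shorter names at the word edges) and a dict line laid out as in the error log above; the function names are hypothetical and this is not Julius's own code.

```python
def parse_dict_line(line):
    """Split a dict line into (first field, output string, phone list).

    Assumes the layout seen in the error log:
    "2   [AARDVARK]  aa r d v aa r k"
    """
    fields = line.split()
    return fields[0], fields[1], fields[2:]


def word_internal_triphones(phones):
    """Expand a phone sequence into HTK-style word-internal context names."""
    names = []
    for i, center in enumerate(phones):
        name = center
        if i > 0:
            # Prepend the left context, e.g. "r-d".
            name = f"{phones[i - 1]}-{name}"
        if i < len(phones) - 1:
            # Append the right context, e.g. "r-d+v".
            name = f"{name}+{phones[i + 1]}"
        names.append(name)
    return names


_, _, phones = parse_dict_line("2   [AARDVARK]  aa r d v aa r k")
for name in word_internal_triphones(phones):
    print(name)
```

The expansion includes r-d+v and d-v+aa, the two names reported as missing in the log. Every generated name has to appear in the model's triphone list, otherwise the word is rejected.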

Possible solutions:
1. Enable the -forcedict option, either on the command line or by uncommenting it in the jconf file, to skip the erroneous dictionary words and force Julius to run.
or
2. Map each "not found" triphone to the closest physical triphone in the hmmlist file ("tiedlist").
For example:
b-ey+t v-eh+t
The first column is the name of the triphone (generated from your dictionary), and the second column is the name of the HMM actually defined in your AM.

This is only practical when there are just a few missing triphones.

3. The best solution is not to include words in your dict file that are not in the AM.
Note that the first two solutions are for testing Julius only; for production or commercial projects you must train the acoustic model and the language model on the same corpus.
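For solution 2, a small Python sketch like the following could propose candidate lines to add to the tiedlist. The notion of "closest" used here is only a naive heuristic assumed for illustration (same center phone, then as many matching contexts as possible); it knows nothing about acoustics, so each suggestion should be reviewed by hand. The function names and the sample tiedlist entries are hypothetical.

```python
def split_triphone(name):
    """Split 'l-c+r' into (left, center, right); a missing context is None."""
    left, rest = name.split("-", 1) if "-" in name else (None, name)
    center, right = rest.split("+", 1) if "+" in rest else (rest, None)
    return left, center, right


def load_physical_names(tiedlist_lines):
    """Collect the HMM names defined in the AM (last column of each line)."""
    physical = set()
    for line in tiedlist_lines:
        fields = line.split()
        if fields:
            physical.add(fields[-1])
    return physical


def suggest_mapping(missing, physical):
    """Return the physical name sharing the center phone and most contexts."""
    left, center, right = split_triphone(missing)
    best, best_score = None, -1
    for cand in sorted(physical):  # sorted for a deterministic tie-break
        c_left, c_center, c_right = split_triphone(cand)
        if c_center != center:
            continue
        score = (c_left == left) + (c_right == right)
        if score > best_score:
            best, best_score = cand, score
    return best


# Hypothetical tiedlist content, for illustration only.
tiedlist = ["r-d+b", "aa-d+v", "d-v+ae", "b-ey+t v-eh+t"]
physical = load_physical_names(tiedlist)
for missing in ["r-d+v", "d-v+aa"]:
    target = suggest_mapping(missing, physical)
    if target is not None:
        # Same two-column layout as the example above: logical, then physical.
        print(missing, target)
```

If no HMM with the same center phone exists, the function returns None; in that case the word is better removed from the dictionary, as in solution 3.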