Getting WAV file transcription to work with Sphinx4
I have Sphinx-4 installed on my Windows XP system, with JSAPI set up. I'd like to transcribe a WAV (or MP3) file of spoken English to text.
When I run the "WavFile" demo, it runs successfully:
java -jar WavFile.jar
But, when I pass my own wav file like this:
java -jar WavFile.jar c:\test.wav
I get:
Loading Recognizer as defined in 'jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/config.xml'...
Decoding jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav
Result: one two three four five
It seems this demo is set up to always load and decode an internal WAV file ("12345.wav"), ignoring the path I pass.
I've read the docs and can't figure out how to set up the "config.xml", or even what directory to place it in. I'm just trying to get a simple proof of concept running using the standard demos.
So, the question is: how do I run a Sphinx4 program to transcribe a WAV file?
Thanks.
Comments (3)
What's needed is to write a new application (based on Transcriber.java) that uses the CMU Dictionary (American English) instead of the numbers that Transcriber.jar supports.
It is quite strange that Sphinx does not come with such a useful sample.
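Whichever application ends up doing the decoding, it can help to confirm what format your own recording is in before handing it to the recognizer. The sketch below uses only the standard javax.sound.sampled API to read the WAV header. The 16 kHz, 16-bit, mono target is an assumption (it is not stated in this thread) about what the acoustic models bundled with the demos expect, and the class and method names are made up for illustration. Stock Java Sound typically has no MP3 decoder, so an MP3 file would be rejected with an UnsupportedAudioFileException.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

public class WavFormatCheck {

    // Assumed target format (not confirmed by the thread): 16 kHz, 16-bit, mono, signed PCM.
    static final float ASSUMED_SAMPLE_RATE = 16000f;
    static final int ASSUMED_SAMPLE_BITS = 16;
    static final int ASSUMED_CHANNELS = 1;

    /** Returns true when the format matches the assumed target format above. */
    static boolean matchesAssumedModelFormat(AudioFormat format) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                && format.getSampleRate() == ASSUMED_SAMPLE_RATE
                && format.getSampleSizeInBits() == ASSUMED_SAMPLE_BITS
                && format.getChannels() == ASSUMED_CHANNELS;
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.err.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        File wav = new File(args[0]);
        // Reads the file header; throws UnsupportedAudioFileException
        // when Java Sound does not recognize the file type.
        AudioFormat format = AudioSystem.getAudioFileFormat(wav).getFormat();
        System.out.println(wav + ": " + format);
        System.out.println(matchesAssumedModelFormat(format)
                ? "Format matches the assumed 16 kHz / 16-bit / mono target."
                : "Format differs from the assumed target; consider converting it first.");
    }
}
```

Run it as `java WavFormatCheck c:\test.wav`. If the reported format differs from the assumed target, converting the recording with an audio editor before decoding is one thing to try.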
I know this is a super old thread, but I just wanted to point out that your example seems to have run perfectly. Look at the very end of your output:
Decoding jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav
Result: one two three four five <========== RESULTS FROM DECODING WAV AUDIO!
Look at the pocketsphinx package. It's written in C, is available for most major platforms, and can be used as a command-line tool or as part of an app. I have been working with it from the command line and it is extraordinarily versatile.
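As a rough illustration of the "part of an app" route, here is a minimal sketch that launches the command-line decoder from Java and echoes whatever it prints. It assumes an older pocketsphinx release whose `pocketsphinx_continuous` executable is on the PATH and accepts an `-infile` option; newer releases may name the program and its options differently, so check the documentation of the version you install. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

public class PocketSphinxCli {

    /**
     * Builds the command line for decoding one file.
     * Executable name and option are assumptions about the installed release.
     */
    static List<String> buildCommand(String wavPath) {
        return Arrays.asList("pocketsphinx_continuous", "-infile", wavPath);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: java PocketSphinxCli <file.wav>");
            return;
        }
        ProcessBuilder builder = new ProcessBuilder(buildCommand(args[0]));
        // Merge stderr into stdout so log lines and recognized text arrive on one stream.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
        System.out.println("exit code: " + process.waitFor());
    }
}
```

Keeping the command construction in its own method makes it easy to swap in a different executable or extra options without touching the process-handling code.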