A good speech recognition API

Posted 2024-10-27 04:42:56 · 252 words · 1 view · 0 comments


I am developing a speech recognition project on Windows 7, using the System.Speech API that ships with .NET, in C#.

The problem I am facing is that dictation recognition is not accurate enough. Also, whenever I start my application, the desktop speech recognition starts automatically, which is a big nuisance: since the words I speak are already not clear enough, conflicting recognitions get interpreted as commands, and actions like application switching and minimizing are carried out.

Can you suggest any good speech API other than this Microsoft blunder? It would be good even if it could understand just simple dictation grammar.


Comments (2)

风透绣罗衣 2024-11-03 04:42:56


I think desktop recognition is starting because you are using a shared desktop recognizer. You should use an in-proc recognizer for your application only. You do this by instantiating a SpeechRecognitionEngine in your application.

Since you are using the dictation grammar and the desktop Windows recognizer, I believe the speaker can train it to improve its accuracy. Go through the Windows 7 recognizer training and see if the accuracy improves.

To get started with .NET speech, there is a very good article that was published a few years ago at http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. It is probably the best introductory article I’ve found so far. It is a little out of date, but very helpful. (The AppendResultKeyValue method was dropped after the beta.)

Here is a quick sample showing one of the simplest .NET Windows Forms apps using a dictation grammar that I could think of. It should work on Windows Vista or Windows 7. I created a form, dropped a button on it, and made the button big. Then I added a reference to System.Speech and the line:

using System.Speech.Recognition;

Then I added the following event handler to button1:

private void button1_Click(object sender, EventArgs e)
{
    // In-proc recognizer: unlike the shared recognizer, this does not
    // start the desktop speech recognition UI.
    SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();
    Grammar dictationGrammar = new DictationGrammar();
    recognizer.LoadGrammar(dictationGrammar);
    try
    {
        button1.Text = "Speak Now";
        recognizer.SetInputToDefaultAudioDevice();
        // Blocks until a single recognition attempt completes.
        RecognitionResult result = recognizer.Recognize();
        // Recognize() returns null if nothing was recognized.
        button1.Text = (result != null) ? result.Text : "(no speech recognized)";
    }
    catch (InvalidOperationException exception)
    {
        button1.Text = String.Format("Could not recognize input from default audio device. Is a microphone or sound card available?\r\n{0} - {1}.", exception.Source, exception.Message);
    }
    finally
    {
        recognizer.UnloadAllGrammars();
        recognizer.Dispose();
    }
}
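Note that the blocking Recognize() call above freezes the UI thread while listening. As a sketch (not part of the original answer), the same in-proc engine also supports event-driven continuous dictation via RecognizeAsync, which keeps the form responsive; the form and control names here are illustrative:

```csharp
// Sketch: event-driven continuous dictation with an in-proc engine.
// Assumes a WinForms project with references to System.Speech and
// System.Windows.Forms; names are illustrative.
using System;
using System.Speech.Recognition;
using System.Windows.Forms;

public class DictationForm : Form
{
    private readonly SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();

    public DictationForm()
    {
        recognizer.LoadGrammar(new DictationGrammar());
        recognizer.SetInputToDefaultAudioDevice();
        // The event fires on the engine's thread; marshal back to the UI thread.
        recognizer.SpeechRecognized += (s, e) =>
            Invoke(new Action(() => Text = e.Result.Text));
        // RecognizeMode.Multiple keeps listening after each phrase.
        recognizer.RecognizeAsync(RecognizeMode.Multiple);
    }

    protected override void OnFormClosed(FormClosedEventArgs e)
    {
        recognizer.RecognizeAsyncStop();
        recognizer.Dispose();
        base.OnFormClosed(e);
    }
}
```

This cannot run without a microphone and the Windows speech stack, so treat it as a starting point rather than a drop-in snippet.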

A little more information comparing the various flavors of speech engines and APIs shipped by Microsoft can be found at What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?

☆獨立☆ 2024-11-03 04:42:56


If you need a speech recognition engine with roughly the 90% accuracy of Cortana, follow these steps.

Step 1) Download the NuGet package Microsoft.Windows.SDK.Contracts

Step 2) Migrate the project from packages.config to PackageReference --> https://devblogs.microsoft.com/nuget/migrate-packages-config-to-package-reference/
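After the migration, the SDK reference ends up in the project file roughly like this (a minimal sketch; the version number is illustrative, so use the latest Microsoft.Windows.SDK.Contracts release):

```xml
<ItemGroup>
  <!-- Illustrative version; pick the release matching your target Windows 10 SDK -->
  <PackageReference Include="Microsoft.Windows.SDK.Contracts" Version="10.0.19041.1" />
</ItemGroup>
```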

The above-mentioned SDK gives you the Windows 10 speech recognition system inside Win32 apps. This has to be done because otherwise the only way to use this speech recognition engine is to build a Universal Windows Platform application.
I don't recommend making an A.I. application on the Universal Windows Platform because of its sandboxing: the sandbox isolates the app in a container, won't let it communicate with hardware, makes file access an absolute pain, and rules out thread management; only async functions are available.

Step 3) Add this namespace to your using section. It contains all the functions related to online speech recognition.

using Windows.Media.SpeechRecognition;

Step 4) Add the speech recognition implementation.

Task.Run(async () =>
{
  try
  {
    var speech = new SpeechRecognizer();
    await speech.CompileConstraintsAsync();
    SpeechRecognitionResult result = await speech.RecognizeAsync();
    // Marshal back to the UI thread; setting TextBox1.Text directly from
    // this background task would throw a cross-thread exception in WinForms.
    TextBox1.Invoke(new Action(() => TextBox1.Text = result.Text));
  }
  catch (Exception ex)
  {
    // Don't swallow errors silently; surface them somewhere visible.
    Console.WriteLine("Speech recognition failed: " + ex.Message);
  }
});

The majority of the methods in the Windows 10 SpeechRecognizer class need to be called asynchronously, which means you must run them inside a Task.Run(async () => { }) lambda, an async method, or an async Task method.
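As an alternative to the Task.Run lambda, a sketch assuming a WinForms async event handler: awaiting directly in the handler resumes the continuation on the UI thread, so the TextBox (an assumed control name) can be updated without Invoke:

```csharp
// Sketch: async event handler instead of Task.Run.
// Assumes Microsoft.Windows.SDK.Contracts is referenced so the
// Windows.Media.SpeechRecognition types and awaitable IAsyncOperation
// interop are available; control names are illustrative.
using System;
using Windows.Media.SpeechRecognition;

private async void button1_Click(object sender, EventArgs e)
{
    try
    {
        var speech = new SpeechRecognizer();
        await speech.CompileConstraintsAsync();
        SpeechRecognitionResult result = await speech.RecognizeAsync();
        // Continuations after await resume on the UI thread here,
        // so no Invoke is needed.
        TextBox1.Text = result.Text;
    }
    catch (Exception ex)
    {
        TextBox1.Text = "Recognition failed: " + ex.Message;
    }
}
```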

In order for this to work, go to Settings -> Privacy -> Speech in the OS and check that online speech recognition is allowed.
