C# speech recognition - is this what the user said?

Posted 2024-07-07 22:17:55

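I need to write an application that uses a speech recognition engine (either the built-in Vista one or a third-party one) that can display a word or phrase and recognize when the user reads it (or an approximation of it). I also need to be able to switch quickly between languages without changing the language of the operating system.

Users will use the system for very short periods. The application needs to work without first training the recognition engine to the user's voice.

It would also be fantastic if this could run on Windows XP or on lesser versions of Windows Vista.

Optionally, the system needs to be able to read the information on the screen back to the user in the user's selected language. I can work around this requirement with pre-recorded voice-overs, but the preferred method would be a text-to-speech engine.

Can anyone recommend something for me?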

や莫失莫忘 2024-07-14 22:17:56

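If it is an engine you are asking about, this is what I have found (note that I am only listing them; I have not tried any of them):

Lumenvox engine

There is also the SAPI SDK from Microsoft itself. I have only tried the text-to-speech part, but according to its description:

The SDK also includes freely distributable text-to-speech (TTS) engines (in U.S. English and Simplified Chinese) and speech recognition (SR) engines (in U.S. English, Simplified Chinese, and Japanese).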

长伴 2024-07-14 22:17:56

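Be aware that you will not get good results without training first. Speech recognition is a statistical application of phonetics, a field that quite frankly admits the signal varies so much that it is almost a miracle anyone can understand what anyone else says. An off-the-shelf speech recognition engine will likely be tuned to a fairly generic English accent and will fail badly on anything even slightly different.

That is why training is so important. We can easily do well by overfitting, especially if we narrow the problem space. But creating a machine learning solution that scales? That is where the problem always lies.

That said, consider Sphinx-4. It is an off-the-shelf solution written in Java, available at http://cmusphinx.sourceforge.net/sphinx4/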

夜空下最亮的亮点 2024-07-14 22:17:56

Have a look at the new speech class library in .NET 3.5:

http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognizer.aspx

General documentation for SR and TTS:

http://msdn.microsoft.com/en-us/library/system.speech.recognition.aspx
http://msdn.microsoft.com/en-us/library/system.speech.synthesis.aspx

我不吻晚风 2024-07-14 22:17:56

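Text-to-speech is available through the Speech API. Personally, I would probably require Vista and use the managed interfaces System.Speech.Recognition and System.Speech.Synthesis.TtsEngine, but if you really need XP support, P/Invoke should get you into the unmanaged API.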

可可 2024-07-14 22:17:56

Try Microsoft Speech Server, which I believe is now Office Communications Server 2007. It includes SR/TTS engines, a C# API, and tools that integrate with Visual Studio.

终难愈 2024-07-14 22:17:56

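Here is the MSDN Magazine article that first discussed the System.Speech API in Vista. Some of it is out of date, because the API changed between the beta (when the article was written) and the Vista release, but it is still one of the best resources I have found and gives a good introduction to the System.Speech namespace. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx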

我不是你的备胎 2024-07-14 22:17:56

The Dragon NaturallySpeaking SDK may be worth a look.
This project also looks interesting.

I haven't played with either of them, though.

我偏爱纯白色 2024-07-14 22:17:56

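This question already has plenty of good answers, but I think it is worth updating Rob Segal's and Philipp Schmid's answers with some information from the 2016 documentation, which includes this nice code sample:

https://msdn.microsoft.com/en-us/library/office/system.speech.recognition.speechrecognitionengine.aspx

It does not use the Windows shared recognizer (the small Windows microphone shown in the middle of the screen); instead it uses an in-process SpeechRecognitionEngine, which needs no visual cues. The user interface is entirely under your control.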
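
A minimal sketch of that in-process approach, applied to the original question (show a phrase, then check whether the user read it). This is not the code from the linked sample. The prompt text, the 0.6 confidence threshold, and the names PhraseCheck and IsMatch are illustrative assumptions. It also assumes a reference to System.Speech and an installed recognizer whose language matches the current UI culture.

```csharp
using System;
using System.Speech.Recognition;

static class PhraseCheck
{
    // Hypothetical helper: accept a result only if the text matches the prompt
    // and the engine's confidence clears an arbitrary threshold.
    public static bool IsMatch(string expected, string recognized,
                               float confidence, float threshold = 0.6f)
    {
        return confidence >= threshold &&
               string.Equals(expected, recognized, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        const string prompt = "good morning"; // assumed phrase shown to the user

        // In-process engine: no shared-recognizer UI appears on screen.
        using (var engine = new SpeechRecognitionEngine())
        {
            // The grammar contains only the prompted phrase.
            engine.LoadGrammar(new Grammar(new GrammarBuilder(prompt)));
            engine.SetInputToDefaultAudioDevice();

            Console.WriteLine("Please say: " + prompt);

            // Synchronous recognition; the result is null if nothing was
            // recognized (for example, only silence during the timeout).
            RecognitionResult result = engine.Recognize(TimeSpan.FromSeconds(10));

            bool ok = result != null && IsMatch(prompt, result.Text, result.Confidence);
            Console.WriteLine(ok ? "Recognized." : "Not recognized.");
        }
    }
}
```

The threshold is worth tuning by experiment: a single-phrase grammar tends to match almost anything the user says, so the confidence value carries most of the decision.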

残龙傲雪 2024-07-14 22:17:55

A similar question was asked on Joel on Software a while back. You can use the System.Speech.Recognition namespace to do this...with some limitations. Add System.Speech (should be in the GAC) to your project. Here's some sample code for a WinForms app:

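using System;
using System.Speech.Recognition;
using System.Windows.Forms;

public partial class Form1 : Form
{
  // Shared Windows recognizer; requires a reference to System.Speech.
  SpeechRecognizer rec = new SpeechRecognizer();

  public Form1()
  {
    InitializeComponent();
    rec.SpeechRecognized += rec_SpeechRecognized;
  }

  // Show the recognized text on the form.
  void rec_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
  {
    lblLetter.Text = e.Result.Text;
  }

  // Assumes this handler is wired to the form's Load event (e.g. in the designer).
  void Form1_Load(object sender, EventArgs e)
  {
    // Build a grammar whose only choices are the strings "0" through "100".
    var c = new Choices();
    for (var i = 0; i <= 100; i++)
      c.Add(i.ToString());
    var gb = new GrammarBuilder(c);
    var g = new Grammar(gb);
    rec.LoadGrammar(g);
    rec.Enabled = true;
  }
}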

This recognizes the numbers from 0 to 100, and displays the resulting number on the form. You'll need a form with a label called lblLetter on it.

System.Speech only works with a pre-defined list of words or phrases; it's not exactly NaturallySpeaking, either in versatility or in recognition quality. But you don't have to train it to the user's voice, and if you only have a few different things the user can say, it works reasonably well. And it's free! (if you have Visual Studio)

It won't work well if you use very short phrases; I made a program for my kid to say letters of the alphabet and see them on-screen, but it doesn't do that well since many of the letters sound alike (especially from the mouth of a four-year-old).

As for more flexible options...well, there's the aforementioned NaturallySpeaking, which has an SDK. But you have to contact sales to get any sort of access to it, and no pricing is listed, so it comes across as one of those "How much does it cost? Well, how much have you got?" kind of things. There doesn't seem to be a "download and play around with it" option. :(

As for text-to-speech, System.Speech.Synthesis does this. It's even easier than the speech recognition. I wrote a small program to let me type, hit Enter, and read the text aloud. My four-year-old gets mesmerized by it. :) ("Daddy, I wanna tawk to da wobot.")
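
A rough sketch of such a type-and-speak program, extended to pick a voice for a chosen language as the question asks. The "en-US" culture and the class name ReadAloud are assumptions for illustration. It needs a reference to System.Speech, and it only switches voice if a voice for that culture is actually installed.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Speech.Synthesis;

static class ReadAloud
{
    static void Main()
    {
        var culture = new CultureInfo("en-US"); // assumed user-selected language

        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Pick the first enabled voice for the culture, if one is installed;
            // otherwise the synthesizer keeps its default voice.
            InstalledVoice voice = synth.GetInstalledVoices(culture)
                                        .FirstOrDefault(v => v.Enabled);
            if (voice != null)
                synth.SelectVoice(voice.VoiceInfo.Name);

            // Read lines from the console and speak each one; an empty line exits.
            string line;
            while (!string.IsNullOrEmpty(line = Console.ReadLine()))
                synth.Speak(line); // blocks until the text has been spoken
        }
    }
}
```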

暮凉 2024-07-14 22:17:55

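[Note: I am the development lead for the managed speech recognition API in .NET 3.0]

System.Speech is part of .NET 3.0, so it is available on both Vista and XP. On Vista you have the added benefit that the OS ships with a speech recognition engine pre-installed. On XP your choices are: use the SAPI 5.1 SDK, which comes with a very old engine (but one that is probably sufficient for your command-and-control scenario), or install Office 2003, which installs a newer version of the recognizer. There are also a few SAPI 5 compliant speech recognition engines available.

If you need to switch languages, you will need to use the System.Speech.Recognition.SpeechRecognitionEngine class, which lets you select the SR engine for the language you need to support. Note that engines are defined by the set of languages they support (they may use the same binary and just swap data files to support other languages).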
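
As a hedged illustration of that language selection, the sketch below lists the installed recognizers and builds an engine for one culture. The helper name CreateEngine, the "en-US" culture, and the phrases are assumptions. Which recognizers are installed depends entirely on the machine.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Speech.Recognition;

static class LanguageSwitch
{
    // Returns an engine for the requested culture, or null if no matching
    // recognizer is installed on this machine.
    public static SpeechRecognitionEngine CreateEngine(string cultureName,
                                                       params string[] phrases)
    {
        var culture = new CultureInfo(cultureName);
        RecognizerInfo info = SpeechRecognitionEngine.InstalledRecognizers()
            .FirstOrDefault(r => r.Culture.Equals(culture));
        if (info == null) return null;

        var engine = new SpeechRecognitionEngine(info);

        // The grammar's culture has to match the recognizer's culture.
        var builder = new GrammarBuilder { Culture = culture };
        builder.Append(new Choices(phrases));
        engine.LoadGrammar(new Grammar(builder));
        return engine;
    }

    static void Main()
    {
        // Show which languages are available without touching the OS language.
        foreach (RecognizerInfo r in SpeechRecognitionEngine.InstalledRecognizers())
            Console.WriteLine(r.Culture.Name + ": " + r.Description);

        using (var engine = CreateEngine("en-US", "hello", "goodbye"))
        {
            if (engine == null)
            {
                Console.WriteLine("No en-US recognizer installed.");
                return;
            }
            engine.SetInputToDefaultAudioDevice();
            engine.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.ReadLine(); // keep listening until Enter is pressed
        }
    }
}
```

Switching language at runtime then amounts to disposing the current engine and calling CreateEngine again with another culture name and that language's phrases.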

If you need more information, leave a comment.

Philipp

美男兮 2024-07-14 22:17:55

Before this, add a reference to the System.Speech assembly.

I found that the code example posted by Kyralessa on Oct 22nd didn't work for me, but a slightly revised version did. When adding strings to the Choices object, use fully spelled-out English words, not digits. It seems the MS speech recognition engine can't recognize numbers by themselves.

I have marked these modifications with comments in the previous example.

using System;
using System.Speech.Recognition;
using System.Windows.Forms;

public partial class Form1 : Form
{
  SpeechRecognizer rec = new SpeechRecognizer();

  public Form1()
  {
    InitializeComponent();
    rec.SpeechRecognized += rec_SpeechRecognized;
  }

  void rec_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
  {
    lblLetter.Text = e.Result.Text;
  }

  void Form1_Load(object sender, EventArgs e)
  {
    var c = new Choices();

    // Doesn't work; you must add English words to Choices to
    // populate the grammar.
    //
    //for (var i = 0; i <= 100; i++)
    //  c.Add(i.ToString());

    c.Add("one");
    c.Add("two");
    c.Add("three");
    c.Add("four");
    // etc...

    var gb = new GrammarBuilder(c);
    var g = new Grammar(gb);
    rec.LoadGrammar(g);
    rec.Enabled = true;
  }
}
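
Rather than typing out every word by hand, the "etc..." could be filled in by a small helper along these lines. This is only a sketch: NumberWords, ToWords, and Range are hypothetical names, the range is limited to 0 through 100, and whether the engine really needs spelled-out words is this answer's observation rather than something the sketch verifies.

```csharp
using System;
using System.Collections.Generic;

static class NumberWords
{
    static readonly string[] Ones =
    {
        "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
        "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
        "seventeen", "eighteen", "nineteen"
    };

    static readonly string[] Tens =
    {
        "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
    };

    // Spells out an integer from 0 to 100 in English.
    public static string ToWords(int n)
    {
        if (n < 0 || n > 100) throw new ArgumentOutOfRangeException("n");
        if (n == 100) return "one hundred";
        if (n < 20) return Ones[n];
        string tens = Tens[n / 10];
        return n % 10 == 0 ? tens : tens + " " + Ones[n % 10];
    }

    // Spelled-out words for every integer in [from, to].
    public static string[] Range(int from, int to)
    {
        var words = new List<string>();
        for (int i = from; i <= to; i++) words.Add(ToWords(i));
        return words.ToArray();
    }

    static void Main()
    {
        // prints "ninety eight, ninety nine, one hundred"
        Console.WriteLine(string.Join(", ", Range(98, 100)));
    }
}
```

In the form above, the individual c.Add calls could then be replaced with `var c = new Choices(NumberWords.Range(0, 100));`.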
