What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?

Posted on 2024-09-04 03:22:55

There are two similar namespaces and assemblies for speech recognition in .NET. I’m trying to understand the differences and when it is appropriate to use one or the other.

There is System.Speech.Recognition from the assembly System.Speech (in System.Speech.dll). System.Speech.dll is a core DLL in the .NET Framework class library 3.0 and later.

There is also Microsoft.Speech.Recognition from the assembly Microsoft.Speech (in microsoft.speech.dll). Microsoft.Speech.dll is part of the UCMA 2.0 SDK.

I find the docs confusing and I have the following questions:

System.Speech.Recognition says it is for "The Windows Desktop Speech Technology". Does this mean it cannot be used on a server OS, or cannot be used for high-scale applications?

The UCMA 2.0 Speech SDK ( http://msdn.microsoft.com/en-us/library/dd266409%28v=office.13%29.aspx ) says that it requires Microsoft Office Communications Server 2007 R2 as a prerequisite. However, I’ve been told at conferences and meetings that if I do not require OCS features like presence and workflow I can use the UCMA 2.0 Speech API without OCS. Is this true?

If I’m building a simple recognition app for a server application (say I wanted to automatically transcribe voice mails) and I don’t need features of OCS, what are the differences between the two APIs?

Comments (4)

风追烟花雨 2024-09-11 03:22:55

The short answer is that Microsoft.Speech.Recognition uses the Server version of SAPI, while System.Speech.Recognition uses the Desktop version of SAPI.

The APIs are mostly the same, but the underlying engines are different. Typically, the Server engine is designed to accept telephone-quality audio for command & control applications; the Desktop engine is designed to accept higher-quality audio for both command & control and dictation applications.

You can use System.Speech.Recognition on a server OS, but it's not designed to scale nearly as well as Microsoft.Speech.Recognition.

The differences are that the Server engine won't need training, and will work with lower-quality audio, but will have a lower recognition quality than the Desktop engine.
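To illustrate the point that the APIs are mostly the same, here is a minimal command & control sketch against the server engine. It assumes the Microsoft Speech Platform runtime and an en-US server recognizer are installed; the file name `command.wav` and the three command words are hypothetical. Under the same assumptions, changing the `using` line to `System.Speech.Recognition` should target the desktop engine instead.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition; // swap for System.Speech.Recognition to use the desktop engine

class CommandRecognition
{
    static void Main(string[] args)
    {
        // Hypothetical input file; pass a real path as the first argument.
        string wavPath = args.Length > 0 ? args[0] : "command.wav";
        var culture = new CultureInfo("en-US");

        using (var engine = new SpeechRecognitionEngine(culture))
        {
            // A small fixed vocabulary: the kind of grammar both engines handle.
            var commands = new Choices("play", "delete", "next");
            var builder = new GrammarBuilder(commands) { Culture = culture };
            engine.LoadGrammar(new Grammar(builder));

            // Inproc engines can read audio from a file instead of a device.
            engine.SetInputToWaveFile(wavPath);

            // Recognize() returns null when nothing matched the grammar.
            RecognitionResult result = engine.Recognize();
            Console.WriteLine(result != null ? result.Text : "(nothing recognized)");
        }
    }
}
```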

莫多说 2024-09-11 03:22:55

I found Eric’s answer really helpful, I just wanted to add some more details that I found.

System.Speech.Recognition can be used to program the desktop recognizers. SAPI and Desktop recognizers have shipped in the products:

  • Windows XP: SAPI v5.1 and no recognizer
  • Windows XP Tablet Edition: SAPI v5.1 and Recognizer v6.1
  • Windows Vista: SAPI v5.3 and Recognizer v8.0
  • Windows 7: SAPI v5.4 and Recognizer v8.0?

Servers come with SAPI, but no recognizer:

  • Windows Server 2003: SAPI v5.1 and no recognizer
  • Windows Server 2008 and 2008 R2: SAPI v5.3? and no recognizer

Desktop recognizers have also shipped in products like Office:

  • Microsoft Office 2003: Recognizer v6.1

Microsoft.Speech.Recognition can be used to program the server recognizers. Server recognizers have shipped in the products:

  • Speech Server (various versions)
  • Office Communications Server (OCS) (various versions)
  • UCMA – which is a managed API for OCS that (I believe) included a redistributable recognizer
  • Microsoft Server Speech Platform – recognizer v10.2

The complete SDK for the Microsoft Server Speech Platform 10.2 version is available at http://www.microsoft.com/downloads/en/details.aspx?FamilyID=1b1604d3-4f66-4241-9a21-90a294a5c9a4. The speech engine is a free download. Version 11 is now available at http://www.microsoft.com/download/en/details.aspx?id=27226.

For Microsoft Speech Platform SDK 11 info and downloads, see:

Desktop recognizers are designed to run inproc or shared. Shared recognizers are useful on the desktop where voice commands are used to control any open applications. Server recognizers can only run inproc. Inproc recognizers are used when a single application uses the recognizer or when wav files or audio streams need to be recognized (shared recognizers can’t process audio files, just audio from input devices).

Only Desktop speech recognizers include a dictation grammar (system provided grammar used for free text dictation). The class System.Speech.Recognition.DictationGrammar has no complement in the Microsoft.Speech namespace.
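For the voice mail scenario in the question, free-text transcription therefore means the desktop engine. The following is a rough sketch rather than a tested solution: it assumes a desktop recognizer is installed, `voicemail.wav` is a hypothetical file name, and it treats a null result from `Recognize()` as the end of the audio.

```csharp
using System;
using System.Speech.Recognition;

class TranscribeWav
{
    static void Main(string[] args)
    {
        // Hypothetical input file; pass a real path as the first argument.
        string wavPath = args.Length > 0 ? args[0] : "voicemail.wav";

        // The parameterless constructor uses the default desktop recognizer, inproc.
        using (var engine = new SpeechRecognitionEngine())
        {
            // DictationGrammar exists only in System.Speech, not in Microsoft.Speech.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(wavPath);

            // Each Recognize() call returns one phrase; null is treated here as "no more input".
            RecognitionResult result;
            while ((result = engine.Recognize()) != null)
            {
                Console.WriteLine("{0} (confidence {1:F2})", result.Text, result.Confidence);
            }
        }
    }
}
```

Accuracy on telephone-quality audio may be poor, since the desktop engine expects higher-quality input and benefits from training.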

You can use the APIs to determine which recognizers are installed:

  • Desktop: System.Speech.Recognition.SpeechRecognitionEngine.InstalledRecognizers()
  • Server: Microsoft.Speech.Recognition.SpeechRecognitionEngine.InstalledRecognizers()
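A small sketch of the query for the desktop side follows. Under the assumption that the Microsoft.Speech assembly is referenced, changing the `using` line to `Microsoft.Speech.Recognition` should list the server recognizers instead.

```csharp
using System;
using System.Speech.Recognition; // swap for Microsoft.Speech.Recognition to list server recognizers

class ListRecognizers
{
    static void Main()
    {
        // InstalledRecognizers() returns one RecognizerInfo per installed engine.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0} | {1} | {2}", info.Name, info.Culture.Name, info.Id);
        }
    }
}
```

An empty listing means no recognizer of that family is installed, which is the expected result for the desktop API on the server operating systems listed above.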

I found that I can also see what recognizers are installed by looking at the registry keys:

  • Desktop recognizers: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens
  • Server recognizers: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech Server\v10.0\Recognizers\Tokens
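The same check can be scripted. This sketch simply lists the subkey names under the two paths above; it assumes the process has read access to HKEY_LOCAL_MACHINE, and on 64-bit Windows a 32-bit process may see a redirected view of the registry, so results can differ by process bitness.

```csharp
using System;
using Microsoft.Win32;

class ListRecognizerTokens
{
    // Prints each token (subkey) name found under the given HKLM path.
    static void PrintTokens(string label, string subKeyPath)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(subKeyPath))
        {
            if (key == null)
            {
                // OpenSubKey returns null when the key does not exist.
                Console.WriteLine("{0}: key not found", label);
                return;
            }
            foreach (string token in key.GetSubKeyNames())
            {
                Console.WriteLine("{0}: {1}", label, token);
            }
        }
    }

    static void Main()
    {
        PrintTokens("Desktop", @"SOFTWARE\Microsoft\Speech\Recognizers\Tokens");
        PrintTokens("Server", @"SOFTWARE\Microsoft\Speech Server\v10.0\Recognizers\Tokens");
    }
}
```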

--- Update ---

As discussed in Microsoft Speech Recognition - what reference do I have to add?, Microsoft.Speech is also the API used for the Kinect recognizer. This is documented in the MSDN article http://msdn.microsoft.com/en-us/library/hh855387.aspx

小清晰的声音 2024-09-11 03:22:55

Here is the link for the Speech Library (MS Server Speech Platform):

Microsoft Server Speech Platform 10.1 Released (SR and TTS in 26 languages)

唱一曲作罢 2024-09-11 03:22:55

Seems Microsoft wrote an article that clears things up regarding the differences between Microsoft Speech Platform and Windows SAPI - https://msdn.microsoft.com/en-us/library/jj127858.aspx.
A difference I found myself while converting speech recognition code for Kinect from Microsoft.Speech to System.Speech (see http://github.com/birbilis/Hotspotizer) is that the former supports SRGS grammars with tag-format=semantics/1.0-literals, while the latter does not; you have to convert to semantics/1.0 by changing x to out="x"; in the tags.
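The tag conversion described above can be automated. The sketch below is an illustration, not the code used in Hotspotizer: the class name `SrgsTagConverter`, the method `ToSemantics10`, and the sample grammar are hypothetical, and it assumes every `tag` element holds a plain literal.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class SrgsTagConverter
{
    // XML namespace used by SRGS grammar documents.
    static readonly XNamespace Srgs = "http://www.w3.org/2001/06/grammar";

    // Rewrites <tag>x</tag> as <tag>out="x";</tag> and updates tag-format.
    public static XDocument ToSemantics10(XDocument grammar)
    {
        XElement root = grammar.Root;
        XAttribute format = root.Attribute("tag-format");
        if (format == null || format.Value != "semantics/1.0-literals")
        {
            return grammar; // nothing to convert
        }

        format.Value = "semantics/1.0";
        // ToList() snapshots the tags so the tree can be edited while looping.
        foreach (XElement tag in root.Descendants(Srgs + "tag").ToList())
        {
            // Escape backslashes and quotes so the literal stays a valid string.
            string literal = tag.Value.Trim().Replace("\\", "\\\\").Replace("\"", "\\\"");
            tag.Value = "out=\"" + literal + "\";";
        }
        return grammar;
    }

    static void Main()
    {
        // Hypothetical two-command grammar in the literals format.
        XDocument doc = XDocument.Parse(
            "<grammar version='1.0' xml:lang='en-US' root='cmd' " +
            "tag-format='semantics/1.0-literals' " +
            "xmlns='http://www.w3.org/2001/06/grammar'>" +
            "<rule id='cmd'><one-of>" +
            "<item>go left<tag>LEFT</tag></item>" +
            "<item>go right<tag>RIGHT</tag></item>" +
            "</one-of></rule></grammar>");

        Console.WriteLine(ToSemantics10(doc));
    }
}
```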
