I've got an app which responds to vocal commands. Examples include "Read Headlines" and "Start Visual Studio". The app provides feedback via TTS.
I'd like to expand the app to be modular. Each module should be able to:
- Extend the list of known commands
- Open UI windows (which the parent App must be aware of so it can close them on command in a standard way)
- Queue text for the TTS engine - probably with a priority flag
- Support back-and-forth dialogue: "Show a map of Bristol" might evoke the response "Bristol, USA or England?", after which the app listens for the specific reply (see the sketch after this list)
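To make this concrete, here is roughly the module contract I have in mind; all the names (Host, VoiceModule, queue_tts, MapModule) are illustrative, not an existing API:

```python
# Sketch of a module contract; every name here is hypothetical.
from abc import ABC, abstractmethod
from queue import PriorityQueue
from typing import Callable, Dict, List

class Host:
    """Services the parent app exposes to modules."""
    def __init__(self) -> None:
        self.tts_queue: PriorityQueue = PriorityQueue()  # holds (priority, text)
        self.open_windows: List[str] = []  # window ids the host can close on command

    def queue_tts(self, text: str, priority: int = 5) -> None:
        # Lower number = spoken sooner.
        self.tts_queue.put((priority, text))

class VoiceModule(ABC):
    @abstractmethod
    def commands(self) -> Dict[str, Callable[[Host], None]]:
        """Trigger phrases this module adds to the known-command list."""

class MapModule(VoiceModule):
    def commands(self) -> Dict[str, Callable[[Host], None]]:
        return {"show a map of bristol": self.show_map}

    def show_map(self, host: Host) -> None:
        host.open_windows.append("map-window")  # host now knows about the window
        host.queue_tts("Bristol, USA or England?", priority=1)

# The host merges every module's commands into one dispatch table.
host = Host()
dispatch: Dict[str, Callable[[Host], None]] = {}
for module in (MapModule(),):
    dispatch.update(module.commands())

dispatch["show a map of bristol"](host)
print(host.tts_queue.get())  # -> (1, 'Bristol, USA or England?')
```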
Can someone suggest an appropriate design pattern?
If you want to design the spoken-command feedback, the architectural approach is simple and well established: an event-based source/listener approach, with plugins that can subscribe to events and respond to them, should work.
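A minimal sketch of that source/listener idea, assuming nothing beyond the standard library; EventBus, subscribe, and publish are names invented for illustration:

```python
# Minimal event-bus sketch; EventBus/subscribe/publish are invented names.
from collections import defaultdict
from typing import Callable, DefaultDict, List

class EventBus:
    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def subscribe(self, event: str, listener: Callable[..., None]) -> None:
        self._listeners[event].append(listener)

    def publish(self, event: str, **payload) -> None:
        for listener in self._listeners[event]:
            listener(**payload)

bus = EventBus()
# A plugin registers interest in recognized commands and in TTS requests.
bus.subscribe("command", lambda text: print(f"plugin handling: {text}"))
bus.subscribe("tts", lambda text, priority=5: print(f"speak[{priority}]: {text}"))

# The recognizer publishes; plugins react and can queue speech back.
bus.publish("command", text="read headlines")
bus.publish("tts", text="Here are the headlines.", priority=1)
```

This keeps plugins decoupled: the recognizer only publishes "command" events, and any module can react and publish "tts" events back without knowing about the others.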
For the dialog system the design is more complex. Dialog management requires a tree-like representation of the knowledge space and a manager that tracks the dialog's progress. It's worth reading about CMU's Olympus system to become familiar with the concepts and decisions involved:
http://wiki.speech.cs.cmu.edu/olympus/index.php/Olympus
Bohus, Dan & Alexander I. Rudnicky (2009), "The RavenClaw dialog management framework: Architecture and systems", Computer Speech & Language
http://www.sciencedirect.com/science/article/B6WCW-4TVJ3KG-1/2/d6bfd64173650f150219cf4a43a51a66
Bohus, Dan & Alexander I. Rudnicky (2003), "RavenClaw: Dialog Management Using Hierarchical Task Decomposition and an Expectation Agenda", Eurospeech 2003
http://research.microsoft.com/~dbohus/docs/ravenclaw.ps
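As a toy illustration of the expectation-agenda idea (loosely inspired by RavenClaw, not the actual Olympus API), a dialog manager can keep a stack of open questions and match each heard utterance against the top frame:

```python
# Toy expectation-agenda sketch; loosely after RavenClaw, not real Olympus code.
from typing import Callable, List, Optional, Tuple

Frame = Tuple[str, List[str], Callable[[str], None]]  # (prompt, expected, continuation)

class DialogManager:
    def __init__(self) -> None:
        self.agenda: List[Frame] = []  # stack of open questions

    def ask(self, prompt: str, expected: List[str],
            on_answer: Callable[[str], None]) -> str:
        self.agenda.append((prompt, expected, on_answer))
        return prompt  # the host sends this prompt to the TTS queue

    def hear(self, utterance: str) -> Optional[str]:
        """Match an utterance against the top frame; return a re-prompt if needed."""
        if not self.agenda:
            return None  # no dialogue pending; treat as a normal command
        prompt, expected, on_answer = self.agenda[-1]
        if utterance in expected:
            self.agenda.pop()
            on_answer(utterance)
            return None
        return prompt  # unexpected reply: repeat the question

dm = DialogManager()
print(dm.ask("Bristol, USA or England?", ["usa", "england"],
             lambda choice: print(f"showing map of Bristol, {choice}")))
dm.hear("england")  # -> showing map of Bristol, england
```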
For a more complex design with self-learning and semantic information extraction, head to the publications of the CALO project, which eventually led to Siri:
https://pal.sri.com/Plone/framework/Components
They explain well how such a system responds, learns, and reacts.