Redirect all input from Dragon NaturallySpeaking to Python? (Using Natlink)

Posted on 2024-12-23 11:23:56

I am currently writing an AI program that receives input from Dragon NaturallySpeaking (using Natlink), processes it, and returns spoken output. I came up with a Receiver class, derived from GrammarBase, that captures all input from Dragon and sends it to my parser.

    from natlink import MimicFailed, recognitionMimic
    from natlinkutils import GrammarBase

    # extract_words, process_input, send_to_parser, parse and all_failed
    # are helpers that are not shown here.

    class Receiver(GrammarBase):

        # Catch-all grammar built on an empty list
        gramSpec = """ <start> exported = {emptyList}; """

        def initialize(self):
            # allResults=1 asks Natlink to report every recognition, not only this grammar's
            self.load(self.gramSpec, allResults=1)
            self.activateAll()

        def gotResultsObject(self, recogType, resObj):
            if recogType == 'reject':
                inpt, self.best_guess = [], []
            else:
                inpt = extract_words(resObj)
                inpt = process_input(inpt)  # Forms a list of possible interpretations
                self.best_guess = resObj.getWords(0)
            self.send_input(inpt)

        def send_input(self, inpt):
            send = send_to_parser(inpt)  # Sends first possible interpretation to parser
            try:
                while True:
                    next(send)  # Sends the next possible interpretation if the first is rejected
            except StopIteration:  # If all interpretations are rejected, try sending the input to Dragon
                try:
                    recognitionMimic(parse(self.best_guess))
                except MimicFailed:  # If that fails too, execute all_failed
                    all_failed()
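
The try-each-interpretation-then-fall-back flow in send_input can also be written without relying on StopIteration. The following is only a minimal sketch of that idea, not the code used above: `dispatch`, `accepts` and `fallback` are hypothetical names, with `accepts` standing in for the parser and `fallback` for the recognitionMimic / all_failed step.

```python
def dispatch(interpretations, accepts, fallback):
    """Offer each interpretation to the parser in order.

    Returns the first interpretation the parser accepts. If none is
    accepted, calls `fallback` once and returns None.
    """
    for candidate in interpretations:
        if accepts(candidate):  # hypothetical parser callback
            return candidate
    fallback()  # every interpretation was rejected
    return None
```

Written this way, the fallback runs only when no interpretation was accepted, and the caller can tell the two outcomes apart from the return value.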

This code works as expected, but there are several problems:

  1. Dragon processes the input before sending it to my program. For example, if I were to say "Open Google Chrome.", it would open Google Chrome, and then send the input to Python. Is there a way to send the input to Python without first processing it?

  2. When I call waitForSpeech(), a message box pops up, stating that the Python interpreter is waiting for input. Is it possible (for aesthetics and convenience) to prevent the message box from showing up, and instead terminate the speech collecting process after a significant pause from the user?

Thank you!

Comments (1)

客…行舟 2024-12-30 11:23:57

With respect to your first question: Dragon NaturallySpeaking (DNS) uses the "Open ..." utterance as part of its own internal command resolution. DNS therefore resolves the speech and executes the command well before natlink has a chance at it. The only way around this is to change the utterance in your natlink grammar from "Open ..." to something DNS is not already using, such as "Trigger ...".
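
On the Python side, the workaround amounts to checking for the custom prefix before handing the words to your parser. Here is a rough sketch of that check; `TRIGGER` and `split_trigger` are made-up names for illustration, and the word list is assumed to be what a call like resObj.getWords(0) returns.

```python
TRIGGER = "trigger"  # hypothetical prefix; pick any word DNS does not already claim


def split_trigger(words, trigger=TRIGGER):
    """Return the words that follow the trigger prefix.

    Returns None when the utterance does not start with the prefix,
    so the caller can leave such utterances to DNS.
    """
    if words and words[0].lower() == trigger:
        return list(words[1:])
    return None
```

With this, ["Trigger", "Google", "Chrome"] yields ["Google", "Chrome"] for your parser, while an utterance starting with "Open" yields None and stays with DNS.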

Some of the natlink developers hang out at speechcomputing.com. You may get better responses there.

Good luck!
