I'm trying to write a unique application that uses voice commands to trigger specific functions within the app.
If anyone can help me with this, I'd be eternally in their debt.
Without getting bogged down in details, I'm trying to program an app so
that, for instance, while the application is running, if I say the words
"activate function A", a specific function which already exists in my app is activated.
Have I explained myself clearly? In other words, on the screen of the phone is a button
labeled "function A". When the software is "armed" and in listening mode, I want
the user to be able to simply say the words "activate function A"
(or any other phrase of my choice) and have that screen option selected without
pressing the button by hand; the option is selected/activated
via voice command instead.
My programmers and I have faced difficulties incorporating this new voice command capability,
even though it is obviously possible to do Google searches by voice command, for instance.
Other voice command apps are currently in circulation, such as SMS dictation apps,
email writing apps, etc., so it is clearly possible to create voice command apps.
Does anyone know if this is possible, and if so, do you have advice on how to implement
this function?
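The app-side half of this behaviour (a spoken phrase selecting an existing function while the app is armed) can be sketched independently of any speech engine. The sketch below is in Python for brevity and assumes that some recognizer, not shown here, hands the app a plain-text transcript. The names `VoiceCommandDispatcher`, `normalize`, and `function_a` are hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, Dict, Optional


def normalize(phrase: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that
    'Activate Function A!' and 'activate  function a' compare equal."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", phrase.lower())
    return " ".join(cleaned.split())


class VoiceCommandDispatcher:
    """Maps developer-chosen phrases to functions the app already has."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}
        self.armed = False  # True only while the app is in listening mode

    def register(self, phrase: str, handler: Callable[[], str]) -> None:
        # The phrase is entirely up to the developer, not the platform.
        self._handlers[normalize(phrase)] = handler

    def handle_transcript(self, transcript: str) -> Optional[str]:
        """Run the handler for a registered phrase; ignore everything else."""
        if not self.armed:
            return None
        handler = self._handlers.get(normalize(transcript))
        return handler() if handler else None


def function_a() -> str:
    # Stand-in for the code the on-screen "function A" button already runs.
    return "function A activated"


dispatcher = VoiceCommandDispatcher()
dispatcher.register("activate function A", function_a)
dispatcher.armed = True
```

Calling `dispatcher.handle_transcript(...)` with whatever text the recognizer produces then triggers the same code path as the button press. Exact matching after normalization is the simplest possible policy; a real app would likely need some tolerance for recognition errors.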
QUESTION 2
Assuming that we are unable to activate function A via voice command, is it possible
to use voice command to make the phone place a call that is received
by our server, so that the server then 'pings' the iPhone and instructs it to activate function A?
For this workaround to work, I would need the ability to determine the exact phrase.
In other words, the user can't be forced to say "call function A". I need the
ability to select the phrase which launches the function.
Hopefully I've been clear.
In other words, as a potential workaround to the obstacles we've been facing regarding
using voice command to activate a specific function within our app, is it possible
to harness the voice command capability already present in the phone, i.e., to place
a phone call? This call would then be received by our server, and the server
would ping the phone which placed the call and instruct it to activate the function.
I understand that the app would need to be running before it
could receive the instruction from the server.
If someone can help me to solve this vexing problem, it is not hyperbole to say that
you would change my life!
Thanks so much in advance for any help one of you kind souls can provide!
Michael
Comments (1)
I don't believe the iPhone comes with any built-in speech recognition functions. Consider speaking to Nuance about buying and embedding one of their speech recognition engines. They have DragonDictate for iPhone, but they also provide a fair number of other recognition engines that serve different functions. Embedded solutions are clearly one of their areas of expertise.
Your other path of pushing the audio to your server may be more involved than you expect. Typically this process involves end-pointing (detecting when speech is present) and identification of basic characteristics of the audio, so that the raw stream doesn't need to be passed. Again, investigating the speech recognition engine you intend to use may provide you with the data processing details you need. Passing continuous, raw voice from all phones to your servers is probably not going to be practical.
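To make the end-pointing step concrete, here is a minimal sketch in Python of one simple approach: energy thresholding over fixed-size frames. This is an illustration under stated assumptions, not how Nuance or any particular engine does it. The function names, the 16 kHz sample rate, the threshold, and the "hangover" count are all hypothetical values that would need tuning for a real microphone.

```python
from typing import List, Optional, Tuple


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame)


def find_speech_segments(
    samples: List[int],
    frame_size: int = 160,     # 10 ms at an assumed 16 kHz sample rate
    threshold: float = 1.0e6,  # assumed energy threshold; tune per device
    hangover: int = 3,         # quiet frames tolerated inside a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of stretches that look like speech."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    quiet = 0
    n_frames = len(samples) // frame_size
    for i in range(n_frames):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        if frame_energy(frame) >= threshold:
            if start is None:
                start = i * frame_size  # first loud frame opens a segment
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                # Close the segment at the first of the trailing quiet frames.
                segments.append((start, (i - quiet + 1) * frame_size))
                start, quiet = None, 0
    if start is not None:
        # Audio ended while a segment was still open.
        segments.append((start, (n_frames - quiet) * frame_size))
    return segments
```

Only the samples inside the returned segments would need to be processed or uploaded, which is the reason end-pointing reduces the amount of audio sent to a server.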