Google Speech Recognition not recognizing certain words/phrases such as "um" and "er" | Python

Posted on 2025-01-18 07:38:31

It seems Google speech recognition is stripping out certain parts of my speech, such as "um", "er" and "ahh". The problem is that I want these to be recognized, and I cannot figure out how to enable this.

Here is the code:

import pyttsx3
import speech_recognition

recognizer = speech_recognition.Recognizer()

vocal_imperfections = 0

vi_list = ['hmm', 'umm', 'aha', 'ahh', 'uh', 'um', 'er']

while True:
    try:
        with speech_recognition.Microphone() as mic:
            recognizer.adjust_for_ambient_noise(mic, duration=0.2)
            audio = recognizer.listen(mic)
            # show_all=True returns the raw response (a dict), or an empty list if nothing was recognized
            text = recognizer.recognize_google(audio, language='en-IN', show_all=True)
            #text = recognizer.recognize_ibm(audio)
            if text != []:
                text = text['alternative'][0]['transcript']
                # Compare whole words so that 'er' does not match inside words like 'there'
                if any(word in vi_list for word in text.lower().split()):
                    vocal_imperfections = vocal_imperfections+1
                print(text)
                print(vocal_imperfections)

    # Catch the exception class itself (no parentheses)
    except speech_recognition.UnknownValueError:
        recognizer = speech_recognition.Recognizer()
        continue

It works as intended, except that Google takes out the vocal imperfections. Does anyone know how to enable this, or know of an alternative free, real-time speech recognition service that will recognize vocal imperfections?

Example:
If I were to say: "um, I think today is the 30th"
Google would return: "I think today is the 30th"

耶耶耶 2025-01-25 07:38:31

I checked the Google Cloud Speech-to-Text API documentation and did not see anything about this (as of March 2022). I also came across these related resources:

Detecting filler words in speech-to-text
How can I detect filler words like "ah", "um" using a speech-to-text API like Google Speech API? (Quora): https://www.quora.com/How-can-I-detect-filler-words-like-ah-um-using-a-speech-to-text-API-like-Google-Speech-API
FillerWordShock - one person's research into this topic

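All the evidence suggests that this isn't possible with the Google Cloud Speech-to-Text service (at the moment), and that you will have to look for an alternative service. I won't repeat the alternatives listed in those resources, but there are several options, and you will have to pick the one that best suits your specific needs.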

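Also, you may already know this (apologies if you do), but these kinds of words are usually called "filler" and/or "hesitation" words. That may help when researching the topic.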
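Whichever engine you end up using, it is probably safer to count fillers as whole words rather than with a substring check, since a test like `'er' in text` also matches words such as "there". Here is a minimal sketch, assuming the transcript arrives as a plain string. `count_fillers` and `FILLERS` are hypothetical names made up for this example, not part of any library:

```python
import re

# Hypothetical filler list, copied from the vi_list in the question.
FILLERS = {"hmm", "umm", "aha", "ahh", "uh", "um", "er"}


def count_fillers(transcript, fillers=FILLERS):
    """Count whole-word occurrences of filler words in a transcript."""
    # Tokenize into lowercase words so "Um," matches "um",
    # while "there" does not match "er".
    words = re.findall(r"[a-z']+", transcript.lower())
    return sum(1 for word in words if word in fillers)


print(count_fillers("Um, I think, er, today is the 30th"))  # prints 2
print(count_fillers("I think there is a number"))  # prints 0
```

This only helps once the engine actually returns the fillers in its transcript; it does not change what Google strips out.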

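The good news is that the SpeechRecognition module (which I believe is what you are using, judging by your code) supports several different engines, so hopefully one of them returns filler words.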
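One way to find out is to feed the same recording to several engines and compare the transcripts side by side. The sketch below is only an illustration: `transcribe_with_engines` and `demo_with_microphone` are hypothetical helpers written for this answer, and I cannot confirm which engine, if any, keeps the fillers. It assumes SpeechRecognition and PyAudio are installed, plus pocketsphinx for the offline `recognize_sphinx` engine.

```python
def transcribe_with_engines(audio, engines):
    """Run the same audio through several engines and collect each transcript.

    `engines` maps a label to any callable that takes the audio and returns text.
    """
    results = {}
    for name, recognize in engines.items():
        try:
            results[name] = recognize(audio)
        except Exception as error:
            # For example UnknownValueError, RequestError, or a missing dependency.
            results[name] = f"<failed: {type(error).__name__}>"
    return results


def demo_with_microphone():
    """Record one phrase and print what each engine makes of it."""
    import speech_recognition as sr  # imported here so the helper above has no dependencies

    recognizer = sr.Recognizer()
    with sr.Microphone() as mic:
        recognizer.adjust_for_ambient_noise(mic, duration=0.2)
        audio = recognizer.listen(mic)

    engines = {
        "google": lambda a: recognizer.recognize_google(a, language="en-IN"),
        "sphinx": recognizer.recognize_sphinx,  # offline, needs pocketsphinx
    }
    for name, text in transcribe_with_engines(audio, engines).items():
        print(f"{name}: {text}")
```

Call `demo_with_microphone()` and say something like "um, I think today is the 30th". Other `recognize_*` methods can be added to the `engines` dict in the same way, provided you have the credentials or packages they need.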