19.2. STT (Speech To Text)
https://github.com/Uberi/speech_recognition
19.2.1. Installation
pip install SpeechRecognition
Microphone support
brew install portaudio
pip install pyaudio
Run the following command to authorize microphone access
neo@MacBook-Pro-Neo ~ % python3 -m speech_recognition
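Before moving on, it can help to confirm that the packages installed above are importable from the interpreter you intend to use. The following is a minimal sketch, not part of the original tutorial; the helper name missing_modules is hypothetical, and the module names assume the usual import names (speech_recognition for SpeechRecognition, pyaudio for PyAudio).

```python
import importlib.util

# Import names of the packages installed above.
# Assumption: the import name differs from the pip name for SpeechRecognition.
REQUIRED_MODULES = ["speech_recognition", "pyaudio"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed: " + ", ".join(missing))
    else:
        print("All required modules are importable")
```

If pyaudio is reported as missing on macOS, the usual cause is that portaudio was not installed before running pip.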
19.2.2. Listing microphones
import speech_recognition as sr

# Print every audio device together with the index that Microphone() expects
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))
Output
neo@MacBook-Pro-Neo ~/workspace/python/speech % python3 microphone.py
Microphone with name "Built-in Microphone" found for `Microphone(device_index=0)`
Microphone with name "Built-in Output" found for `Microphone(device_index=1)`
Selecting a microphone device
import speech_recognition as sr

print(sr.__version__)  # optional: print the library version

r = sr.Recognizer()
mic = sr.Microphone(device_index=1)  # 1 is the author's device index; substitute your own
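Device indexes can change when hardware is plugged in or removed, so looking the index up by name may be more robust than hard-coding it. The sketch below is one possible approach; find_device_index is a hypothetical helper, not part of the speech_recognition library, and it works on any list of names such as the one returned by sr.Microphone.list_microphone_names().

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains `keyword`.

    The match is case-insensitive. Returns None when nothing matches.
    """
    needle = keyword.lower()
    for index, name in enumerate(names):
        if needle in name.lower():
            return index
    return None


if __name__ == "__main__":
    # Device names copied from the sample output above
    names = ["Built-in Microphone", "Built-in Output"]
    print(find_device_index(names, "microphone"))
```

The returned value can then be passed as device_index when constructing sr.Microphone; passing None falls back to the system default device.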
Noise suppression
import speech_recognition as sr

print(sr.__version__)  # optional: print the library version

r = sr.Recognizer()
my_mic = sr.Microphone(device_index=1)  # 1 is the author's device index; substitute your own

with my_mic as source:
    print("Say now!!!!")
    r.adjust_for_ambient_noise(source)  # calibrate the energy threshold against background noise
    audio = r.listen(source)            # capture speech from the microphone

print(r.recognize_google(audio))        # convert the captured speech to text
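To give a feel for what calibrating against ambient noise means, here is a simplified illustration of an energy threshold. It is not the library's actual algorithm: it simply measures the RMS energy of a few background frames and sets the threshold a fixed ratio above their average. The function names and the ratio of 1.5 are assumptions chosen for this sketch, and the frames are assumed to be 16-bit little-endian mono PCM.

```python
import math
import struct


def rms(frame):
    """Root-mean-square amplitude of a 16-bit little-endian mono PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def calibrate_threshold(frames, ratio=1.5):
    """Return an energy threshold `ratio` times the mean RMS of background frames."""
    energies = [rms(frame) for frame in frames]
    if not energies:
        raise ValueError("at least one background frame is required")
    return ratio * sum(energies) / len(energies)


def is_speech(frame, threshold):
    """Treat a frame as speech when its energy exceeds the calibrated threshold."""
    return rms(frame) > threshold
```

Frames quieter than the threshold are treated as background, which is why a short pause before speaking tends to improve the calibration.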
19.2.3. PocketSphinx: audio file to text
PocketSphinx recognizes only English by default. For Chinese you need to download additional language model files; Mandarin is the model for standard Chinese (Putonghua).
brew install swig
brew install pocketsphinx
pip install PocketSphinx
Recognizing from a file
import speech_recognition as sr

# obtain audio from the file
recognizer = sr.Recognizer()
audioFile = sr.AudioFile(r"english.wav")
with audioFile as source:
    audio = recognizer.record(source)  # read the entire file

# recognize speech using Sphinx (runs offline)
try:
    print("Sphinx thinks you said: " + recognizer.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))
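sr.AudioFile reads WAV, AIFF, and FLAC files. When a WAV file fails to load or gives poor results, inspecting its header is a reasonable first step. The sketch below uses only the standard wave module; describe_wav and write_silence are hypothetical helpers, and the silent test file exists only so that the example runs without a real recording.

```python
import os
import tempfile
import wave


def write_silence(path, seconds=1.0, rate=16000):
    """Write a mono, 16-bit PCM WAV file containing only silence."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


def describe_wav(path):
    """Return the basic header properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate": rate,
            "duration_seconds": wav.getnframes() / rate,
        }


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "silence.wav")
    write_silence(path)
    print(describe_wav(path))
```

Pointing describe_wav at english.wav shows whether the file is plain PCM; wave.open raises wave.Error for compressed WAV variants.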
Recognizing from the microphone
#!/usr/bin/env python3
import speech_recognition as sr

print(sr.__version__)

# list the available devices
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))

# obtain audio from the default microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Sphinx
try:
    print("Sphinx thinks you said: " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))
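19.2.4. Google Cloud Speech API
Google services must be reachable from your network; where they are blocked you will need a proxy or VPN. Note that recognize_google, used below, calls the Google Web Speech API; the Cloud Speech API is exposed through a separate method, recognize_google_cloud.
import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

try:
    text = r.recognize_google(audio)
    print("You said: " + text)
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
Specifying the language
# show_all=True returns the raw API response rather than a plain string
result = r.recognize_google(audio, language='zh-CN', show_all=True)
text = r.recognize_google(audio, language="es-ES")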
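With show_all=True the call returns the raw response rather than a string, so the transcript has to be extracted by hand. The sketch below assumes the response is a dictionary with an "alternative" list whose entries carry a "transcript" and, for some entries, a "confidence"; that shape is an assumption based on commonly observed responses, so check it against what your installed version returns. best_transcript is a hypothetical helper and the sample data is made up for illustration.

```python
def best_transcript(response):
    """Pick the highest-confidence transcript from a raw response.

    Assumed shape: {"alternative": [{"transcript": str, "confidence": float}, ...]}.
    Entries without a confidence value are treated as confidence 0.0.
    Returns None when the response holds no usable alternative.
    """
    if not isinstance(response, dict):
        return None  # for example, an empty list when nothing was recognized
    alternatives = response.get("alternative") or []
    candidates = [alt for alt in alternatives if "transcript" in alt]
    if not candidates:
        return None
    best = max(candidates, key=lambda alt: alt.get("confidence", 0.0))
    return best["transcript"]


if __name__ == "__main__":
    # Made-up sample in the assumed shape, not real API output
    sample = {
        "alternative": [
            {"transcript": "hello world", "confidence": 0.92},
            {"transcript": "hello word"},
        ],
        "final": True,
    }
    print(best_transcript(sample))
```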
19.2.5. IBM Speech to Text
IBM's service requires an IBM Cloud account. If you do not have one, register first, then create a Speech To Text service instance.
Test that Speech to Text is working
neo@MacBook-Pro-Neo ~/workspace/python/speech % wget https://watson-developer-cloud.github.io/doc-tutorial-downloads/speech-to-text/audio-file.flac
neo@MacBook-Pro-Neo ~/workspace/python/speech % curl -X POST -u "apikey:eXuTdDOg_l7Ljp5bV8NpFsswVq58ebf2Kr-K5dpp5SZK" \
  --header "Content-Type: audio/flac" \
  --data-binary @audio-file.flac \
  "https://api.au-syd.speech-to-text.watson.cloud.ibm.com/instances/8a7df79c-c8fe-4e31-8000-c44bbd025b22/v1/recognize"
#!/usr/bin/env python3
import ssl
from os import path

import speech_recognition as sr

# WARNING: this disables HTTPS certificate verification for the whole process;
# use it only for local testing
ssl._create_default_https_context = ssl._create_unverified_context

# obtain the path to the audio file in the same folder as this script
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "english.wav")
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "french.aiff")
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "chinese.flac")
print(AUDIO_FILE)

# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file

# username/password match older SpeechRecognition releases; check the
# recognize_ibm signature of your installed version, since IBM has moved
# to API-key authentication
try:
    print("IBM Speech to Text thinks you said " + r.recognize_ibm(audio, username="netkiller@msn.com", password="******"))
except sr.UnknownValueError:
    print("IBM Speech to Text could not understand audio")
except sr.RequestError as e:
    print("Could not request results from IBM Speech to Text service; {0}".format(e))
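Credentials such as the API key in the curl command are safer kept out of scripts and shell history. One common approach, sketched below, is to read them from environment variables. The variable names IBM_STT_APIKEY and IBM_STT_URL and the helper load_ibm_credentials are hypothetical choices for this example; only the /v1/recognize path comes from the curl command above.

```python
import os


def load_ibm_credentials(env=None):
    """Read the API key and service URL from the environment.

    Returns (apikey, recognize_url). Raises RuntimeError naming any
    variable that is missing or empty.
    """
    env = os.environ if env is None else env
    apikey = env.get("IBM_STT_APIKEY")  # hypothetical variable name
    url = env.get("IBM_STT_URL")        # hypothetical variable name
    missing = [
        name
        for name, value in (("IBM_STT_APIKEY", apikey), ("IBM_STT_URL", url))
        if not value
    ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return apikey, url.rstrip("/") + "/v1/recognize"


if __name__ == "__main__":
    try:
        key, endpoint = load_ibm_credentials()
        print("Endpoint: " + endpoint)
    except RuntimeError as error:
        print(error)
```

The key and endpoint returned here can then be supplied to curl or to whichever client library you use, without ever appearing in the source file.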