Chapter 19. Speech Processing

Published 2024-02-10 15:26:30

19.1. TTS (Text To Speech)

19.1.1. Installing pyttsx3
19.1.2. Demo
19.1.3. Methods in detail
19.1.4. Example

19.2. STT (Speech To Text)

19.2.1. Installation
19.2.2. Listing microphones
19.2.3. PocketSphinx: file to text
19.2.4. Google Cloud Speech API
19.2.5. IBM Speech to Text

19.3. Baidu AipSpeech

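19.1. TTS (Text To Speech)

pyttsx3: reading text aloud

TTS (Text To Speech) converts written text into spoken audio. It is one building block of AI systems and part of human-machine dialogue: it lets the machine talk.

TTS is an application of speech synthesis. Speech waveforms are first recorded, then optimized, and finally stored in a database; to synthesize speech, the stored waveforms are retrieved and converted into natural-sounding output.

Where TTS is used

TTS helps visually impaired people read information on a computer
Audiobooks for people short on time: TTS can read a book's content aloud
Combined with speech recognition, it enables human-machine interaction, for example dialogue scripts in a customer service system
Situations where looking at a screen is impractical, such as driving: text messages and the caller's phone number can be read aloud
Bus stop announcements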

19.1.1. Installing pyttsx3

pip install pyttsx3

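19.1.1.1. Linux

On Linux, pyttsx3 drives the eSpeak engine, which has to be installed separately:

[root@gitlab ~]# dnf install espeak-ng

If the eSpeak library is missing, pyttsx3 fails with an error such as:

libespeak.so.1: cannot open shared object file: No such file or directory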
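
Before calling pyttsx3 on Linux, it can help to check whether an eSpeak library is visible to the dynamic loader. The following is a minimal sketch that uses only the standard library; the helper name espeak_available and the candidate library names are assumptions made for illustration and are not part of pyttsx3.

```python
import ctypes.util


def espeak_available(names=("espeak", "espeak-ng")):
    """Return True if the loader can locate any of the given libraries."""
    # find_library returns a library name or path, or None when nothing is found.
    return any(ctypes.util.find_library(name) is not None for name in names)


if __name__ == "__main__":
    if espeak_available():
        print("eSpeak library found")
    else:
        print("eSpeak library not found; install espeak-ng first")
```

A negative result here does not prove that pyttsx3 will fail, since library lookup differs between distributions, but it is a cheap first diagnostic.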

19.1.2. Demo

#coding=utf-8
import pyttsx3
pyttsx3.speak("Hello World!")

19.1.3. Methods in detail

19.1.3.1. The say() method

pyttsx3.speak() is a convenience wrapper around the following code:

#coding=utf-8
import pyttsx3
engine = pyttsx3.init()
engine.say("Hello World!")
engine.runAndWait()
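
Because say() only queues an utterance and runAndWait() is what actually processes the queue, several sentences can be queued and spoken with a single blocking call. A minimal sketch, assuming a hypothetical helper speak_all that accepts any engine object exposing say() and runAndWait():

```python
def speak_all(engine, sentences):
    """Queue every non-empty sentence, then speak them in one blocking call."""
    queued = 0
    for sentence in sentences:
        text = sentence.strip()
        if text:
            engine.say(text)  # only queues the utterance
            queued += 1
    if queued:
        engine.runAndWait()  # blocks until the queue has been processed
    return queued


def demo():
    # Requires pyttsx3 and a working audio output; not called automatically.
    import pyttsx3
    return speak_all(pyttsx3.init(), ["Hello World!", "  ", "Goodbye."])
```

Calling demo() should speak two sentences; the blank entry is skipped before it reaches the engine.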

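19.1.3.2. save_to_file()

save_to_file() queues the text for synthesis into a file; nothing is written until runAndWait() processes the queue. Here text is the string to synthesize.

engine.save_to_file(text, 'test.mp3')
engine.runAndWait()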
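
Writing speech to a file follows the same queue-then-run pattern as say(). The sketch below wraps both steps in one function; synthesize_to_file is a hypothetical helper name, and it works with any engine object that provides save_to_file() and runAndWait(). Whether the driver really produces MP3 data for a .mp3 file name may depend on the platform, so treat the extension as an assumption and verify the output.

```python
from pathlib import Path


def synthesize_to_file(engine, text, path):
    """Queue text for file output, run the queue, and return the target path."""
    if not text.strip():
        raise ValueError("text must not be empty")
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)  # make sure the folder exists
    engine.save_to_file(text, str(target))  # only queues the request
    engine.runAndWait()                     # the file is written during this call
    return target
```

With a real engine this would be used as synthesize_to_file(pyttsx3.init(), "Hello World", "out/test.mp3").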

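19.1.3.3. Changing the voice

A voice is selected by its id. Typically voices[0].id is a male voice and voices[1].id a female one, but the list and its order depend on the platform and the voices installed.
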
voices = engine.getProperty('voices')  
engine.setProperty('voice', voices[0].id)
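
Since the order of the voice list is not guaranteed, selecting a voice by a keyword in its name or id can be more robust than a fixed index. A minimal sketch; find_voice and use_voice are hypothetical helpers, and the keyword to search for depends on the voices installed on your machine.

```python
def find_voice(voices, keyword):
    """Return the first voice whose name or id contains keyword, else None."""
    needle = keyword.lower()
    for voice in voices:
        name = (getattr(voice, "name", "") or "").lower()
        voice_id = (getattr(voice, "id", "") or "").lower()
        if needle in name or needle in voice_id:
            return voice
    return None


def use_voice(engine, keyword):
    """Switch the engine to the matching voice; return True on success."""
    voice = find_voice(engine.getProperty("voices"), keyword)
    if voice is None:
        return False
    engine.setProperty("voice", voice.id)
    return True
```

use_voice returns False instead of raising when nothing matches, so the caller can fall back to the default voice.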

19.1.3.4. Changing the speaking rate

The rate is measured in words per minute and usually falls between 0 and 500.

rate = engine.getProperty('rate')
engine.setProperty('rate', 200)
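
Adjusting the rate relative to its current value is often more convenient than setting an absolute number. The sketch below does that and clamps the result to the 0 to 500 range mentioned above; change_rate and its lowest and highest parameters are assumptions for illustration, not pyttsx3 API.

```python
def change_rate(engine, delta, lowest=0, highest=500):
    """Shift the speaking rate by delta, clamped to [lowest, highest]."""
    current = engine.getProperty("rate")
    new_rate = max(lowest, min(highest, current + delta))
    engine.setProperty("rate", new_rate)
    return new_rate
```

For example, change_rate(engine, -75) slows an engine that currently runs at 200 down to 125.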

19.1.3.5. Changing the volume

The volume is a float between 0 and 1.

volume = engine.getProperty('volume')
engine.setProperty('volume', 0.8)
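
User interfaces usually express volume as a percentage, while the engine expects a float between 0 and 1. A small sketch of the conversion with clamping; set_volume_percent is a hypothetical helper name.

```python
def set_volume_percent(engine, percent):
    """Set the volume from a 0-100 percentage; return the value applied."""
    clamped = max(0.0, min(100.0, float(percent)))
    volume = clamped / 100.0  # the engine expects a float between 0 and 1
    engine.setProperty("volume", volume)
    return volume
```

Out-of-range input is clamped rather than rejected, so set_volume_percent(engine, 150) applies full volume.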

19.1.3.6. Listing the available voices

voices = engine.getProperty('voices') 
for item in voices:
    print(item)
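
Printing a voice object directly gives driver-specific output. To get a uniform one-line summary, the attributes a pyttsx3 voice normally carries (id, name, languages, gender, age) can be formatted explicitly. A minimal sketch; describe_voice is a hypothetical helper, and attributes the driver does not provide are shown as None.

```python
def describe_voice(voice):
    """Format the commonly available attributes of a voice on one line."""
    fields = ("id", "name", "languages", "gender", "age")
    parts = [f"{field}={getattr(voice, field, None)!r}" for field in fields]
    return ", ".join(parts)
```

Inside the loop above, print(describe_voice(item)) would replace print(item).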

19.1.4. Example

import pyttsx3
engine = pyttsx3.init()  # create the engine

# Rate
rate = engine.getProperty('rate')    # current speaking rate
print(rate)
engine.setProperty('rate', 125)      # set a new speaking rate

# Volume
volume = engine.getProperty('volume')  # current volume (min=0, max=1)
print(volume)
engine.setProperty('volume', 1.0)      # set the volume, between 0 and 1

# Voice
voices = engine.getProperty('voices')        # list of available voices
# engine.setProperty('voice', voices[0].id)  # index 0: male voice
engine.setProperty('voice', voices[1].id)    # index 1: female voice

engine.say("Hello World!")
engine.say('My current speaking rate is ' + str(rate))
engine.runAndWait()
engine.stop()

# Save speech to a file
# On Linux, make sure that 'espeak' and 'ffmpeg' are installed
engine.save_to_file('Hello World', 'test.mp3')
engine.runAndWait()
