Problem reading a wav file generated with gTTS

Posted on 2025-01-18 13:42:30

For my project, I tried to use gTTS to generate an audio file from a string and then manipulate that audio, but when I try to open the file with scipy or librosa, I get an error saying the format is invalid. What am I doing wrong? Thanks in advance.

The code:

import librosa
from gtts import gTTS
from scipy.io import wavfile

tts = gTTS('Semáforo Gran Vía a 100 metros', lang='es')
filename = 'Senal.mp3'
tts.save(filename)
# y, sr = librosa.load('Senal.mp3')
x = wavfile.read(filename)

The error(s):

Scipy

  File "C:/Users/Samuel/PycharmProjects/pythonProject/main.py", line 108, in <module>
    conversion()
  File "C:/Users/Samuel/PycharmProjects/pythonProject/main.py", line 25, in conversion
    x = wavfile.read(filename)
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\scipy\io\wavfile.py", line 650, in read
    file_size, is_big_endian = _read_riff_chunk(fid)
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\scipy\io\wavfile.py", line 521, in _read_riff_chunk
    raise ValueError(f"File format {repr(str1)} not understood. Only "
ValueError: File format b'\xff\xf3D\xc4' not understood. Only 'RIFF' and 'RIFX' supported.

Librosa

  File "C:/Users/Samuel/PycharmProjects/pythonProject/main.py", line 103, in <module>
    conversion()
  File "C:/Users/Samuel/PycharmProjects/pythonProject/main.py", line 19, in conversion
    y, sr = librosa.load('Senal.mp3')
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
    return f(*args, **kwargs)
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\librosa\core\audio.py", line 174, in load
    y, sr_native = __audioread_load(path, offset, duration, dtype)
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\librosa\core\audio.py", line 198, in __audioread_load
    with audioread.audio_open(path) as input_file:
  File "C:\Users\Samuel\PycharmProjects\pythonProject\venv\lib\site-packages\audioread\__init__.py", line 116, in audio_open
    raise NoBackendError()
audioread.exceptions.NoBackendError

Process finished with exit code 1

Comments (1)

一身仙ぐ女味 2025-01-25 13:42:30

It works fine for me with Python 3.10.8, gTTS 2.3.0, and librosa 0.9.2:

import librosa
from gtts import gTTS

out_file = "tts_out.mp3"
mytext   = "Bonjour a toi l'ordinateur!"
language = 'fr'

# Synthesize the text and save it as an MP3 file
tts      = gTTS(text=mytext, lang=language, slow=False)
tts.save(out_file)

# Decode the MP3 into a float array; librosa resamples to 22050 Hz by default
x, sr    = librosa.load(out_file)
print(f"shape       : {x.shape}")
print(f"sample rate : {sr}")

Output:

shape       : (51333,)
sample rate : 22050
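
For what it's worth, the two tracebacks in the question look like two separate problems. gTTS writes MP3 data whatever the file is called, and scipy.io.wavfile.read only parses RIFF/RIFX WAV containers, which is why it rejects the leading bytes b'\xff\xf3D\xc4' (an MPEG audio frame header). The NoBackendError from librosa usually means audioread could not find a decoder such as FFmpeg on that machine, which would explain why the same call works here. Below is a minimal sketch, assuming librosa can decode MP3 on your system and that the soundfile package is installed. sniff_audio_format and mp3_to_wav are hypothetical helper names chosen for illustration, not part of any library.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file."""
    if header[:4] in (b"RIFF", b"RIFX"):
        return "wav"
    if header[:3] == b"ID3":
        return "mp3"  # MP3 file that starts with an ID3v2 tag
    # An MPEG audio frame starts with 11 set sync bits:
    # 0xFF followed by a byte whose top three bits are set.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def mp3_to_wav(mp3_path: str, wav_path: str) -> None:
    """Decode an MP3 with librosa and write it out as a 16-bit PCM WAV file.

    Assumption: librosa has a working MP3 decoder available
    (for example FFmpeg on the PATH) and soundfile is installed.
    """
    # Imported lazily so sniff_audio_format works without these packages.
    import librosa
    import soundfile as sf

    # sr=None keeps the file's native sample rate instead of resampling.
    samples, sample_rate = librosa.load(mp3_path, sr=None)
    sf.write(wav_path, samples, sample_rate, subtype="PCM_16")


if __name__ == "__main__":
    # Leading bytes quoted in the SciPy error message from the question.
    print(sniff_audio_format(b"\xff\xf3D\xc4"))          # mp3
    print(sniff_audio_format(b"RIFF\x24\x08\x00\x00"))   # wav
```

After something like mp3_to_wav('Senal.mp3', 'Senal.wav'), wavfile.read('Senal.wav') should be able to parse the result, since the file then really is a RIFF WAV.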