How to make a computer sing
I'm trying to develop an online application where the user writes some text and the software sings it back to the user.
I can currently generate an audio file of the computer speaking the words using espeak, but I have no idea how to make it sound like a song or how to add rhythm to it.
I'm able to change the pitch and tempo using rubberband, but that's as far as I've gotten.
Does anyone have a clue how to make this happen?
Comments (2)
If you want to use rubberband to change duration and pitch, then I think the hard part is going to be mapping from phonemes/syllables in the text to the corresponding audio ranges in the speech synthesis output, for which I have no simple suggestion. (Ideally you'd get inside the speech synthesiser so that it would provide you with the mapping from phonemes to audio locations.)
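Supposing the segment boundaries were somehow known, the per-segment arithmetic is the easy part. Here is a rough sketch in Python that turns a measured pitch/length and a target pitch/length into rubberband command-line options (-t for the time ratio, -p for the shift in semitones). The function names and the idea of measuring each segment's pitch in Hz are my own assumptions for illustration; finding the segments is still the unsolved bit.

```python
import math


def rubberband_args(src_hz, dst_hz, src_seconds, dst_seconds):
    """Return rubberband options that move one audio segment onto a target note.

    Assumes the segment's current pitch (src_hz) and length (src_seconds)
    have already been measured somehow; that measurement is not shown here.
    """
    # Pitch shift in semitones: 12 semitones per doubling of frequency.
    semitones = 12 * math.log2(dst_hz / src_hz)
    # Time ratio: output duration divided by input duration.
    time_ratio = dst_seconds / src_seconds
    return ["-t", f"{time_ratio:.3f}", "-p", f"{semitones:.2f}"]


def rubberband_command(in_wav, out_wav, src_hz, dst_hz, src_seconds, dst_seconds):
    """Build the full argument list, e.g. for subprocess.run()."""
    options = rubberband_args(src_hz, dst_hz, src_seconds, dst_seconds)
    return ["rubberband", *options, in_wav, out_wav]


if __name__ == "__main__":
    # Hypothetical segment: a 0.25 s syllable at 110 Hz, sung as 0.5 s at 220 Hz.
    print(" ".join(rubberband_command("syllable.wav", "note.wav", 110.0, 220.0, 0.25, 0.5)))
```

Each processed segment would then need to be concatenated back together in order, which this sketch does not attempt.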
A simpler alternative might be to try Speech Synthesis Markup Language (SSML). Its "prosody" element has "pitch" and "duration" attributes that can specify an absolute pitch in Hz and a duration in seconds or milliseconds. You can also specify volume, for controlling dynamics.
Given this, you could try converting the text into an SSML document and marking up words, syllables, or phonemes with pitch, duration, and volume attributes.
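As a minimal sketch of that conversion, the Python below wraps each word in an SSML "prosody" element with an absolute pitch and a duration. The function name and the (text, Hz, seconds) tuple layout are assumptions made for the example, and how faithfully a given synthesiser honours these attributes varies, so treat the output as something to experiment with rather than a guaranteed melody.

```python
import xml.etree.ElementTree as ET


def notes_to_ssml(notes, lang="en-US"):
    """Build an SSML string from (text, pitch_hz, seconds) tuples.

    The tuple layout is a convention invented for this sketch.
    """
    speak = ET.Element(
        "speak",
        {
            "version": "1.0",
            "xmlns": "http://www.w3.org/2001/10/synthesis",
            "xml:lang": lang,
        },
    )
    for text, pitch_hz, seconds in notes:
        prosody = ET.SubElement(
            speak,
            "prosody",
            {
                # Absolute pitch in hertz, e.g. "220Hz".
                "pitch": f"{pitch_hz:g}Hz",
                # Duration in whole milliseconds, e.g. "500ms".
                "duration": f"{int(round(seconds * 1000))}ms",
            },
        )
        prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical melody: three words on A3, C4 and E4.
    melody = [("sing", 220.0, 0.5), ("to", 261.63, 0.25), ("me", 329.63, 1.0)]
    print(notes_to_ssml(melody))
```

Going through an XML library rather than string concatenation means any "&" or "<" in the user's text gets escaped for free.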
I ended up using Festival's singing mode. It sounds reasonably good, except that it only works with English voices.
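For anyone wanting to try the same route, Festival's singing mode reads an XML file in which each word is wrapped in PITCH and DURATION elements inside a SINGING element. Below is a small Python sketch that generates such a file from a list of notes. The function name and the (word, note, beats) tuple layout are made up for the example, and the markup follows the sample in the Festival manual as far as I remember it, so check it against your installed version.

```python
from xml.sax.saxutils import escape, quoteattr


def festival_song(notes, bpm=60):
    """Return Festival singing-mode XML for (word, note_name, beats) tuples.

    The tuple layout is an assumption of this sketch, not a Festival API.
    """
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE SINGING PUBLIC "-//SINGING//DTD SINGING mark up//EN" '
        '"Singing.v0_1.dtd" []>',
        f'<SINGING BPM="{bpm}">',
    ]
    for word, note, beats in notes:
        # One PITCH/DURATION pair per sung word; text and attributes are escaped.
        lines.append(
            f"<PITCH NOTE={quoteattr(note)}>"
            f'<DURATION BEATS="{beats:g}">{escape(word)}</DURATION>'
            "</PITCH>"
        )
    lines.append("</SINGING>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical three-note phrase.
    song = festival_song([("doe", "G3", 0.5), ("ray", "A3", 0.5), ("me", "B3", 1.0)], bpm=60)
    with open("song.xml", "w", encoding="utf-8") as handle:
        handle.write(song)
```

With Festival and an English voice installed, something like `text2wave -mode singing song.xml -o song.wav` should then render the file to audio.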