Synthesized singing
So this is from the late 90s ... http://www.cs.princeton.edu/~prc/SingingSynth.html
Why hasn't this taken off? (We can synthesize photorealistic images, but the synthesis of singing ... still seems to be at a very primitive stage.)
What exactly is it that makes the synthesis of singing difficult?
http://www.interspeech2007.org/Technical/synthesis_of_singing_challenge.php <-- still seems primitive.
My feeling is that we fall into the uncanny valley more easily with sounds than with images. While our brain accepts a badly formed image relatively well, it does not accept a badly formed sound unless it sounds natural. Anything that does not sound perfectly imperfect sounds creepy, and this is a very strong barrier to actual applications. Synthesis is good enough for announcements and telephone services, but we are a long way from totally synthetic singing.
On the other hand, modification of actual voices is performed daily, both live and in the studio. Without Auto-Tune, all the "gangsta" and "Lady Gaga" acts out there would be doing a job more suited to their actual talent.