Developing TTS Speech Synthesizers for Asian Minority Languages

Posted on 2024-10-14 03:14:37

I want to put together a team to develop TTS speech synthesizers for various phonetic Asian languages. I have the language experts lined up. The final product will be (1) an Android phone app and (2) a web-based TTS service.

Languages in order of implementation:
Mien
Hmong
Lao

The first two have Latin-based orthographies.

My question is: on the programming side, who do I need on my team? What skills and programming languages am I looking for?

Comments (3)

吻风 2024-10-21 03:14:37

Someone with:
1. A DSP (digital signal processing) background, ideally with an emphasis on speech and audio signal processing.
2. Good to great programming skills.
3. A liking for learning new languages. A programmer needs to learn the language the TTS engine is being developed for, or at least gain a rudimentary understanding of it, in order to "know" what is being coded and maybe even improvise upon existing algorithms.

You can take a look at the FestVox and Festival pages, and at CMU's speech recogniser and TTS engine, to see what languages they are developed in. That might give you a better idea.

TTS engines sit almost at the crossroads of linguistic science, DSP (for the back end), and software engineering for implementing all of the above. I think you need some DSP and software people to complete your team.
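To give a rough flavour of the DSP side, here is a minimal sketch in Python (an arbitrary choice; nothing above prescribes a language) that generates a tone whose pitch glides over a syllable-length interval. The function name `synth_glide`, the 16 kHz sample rate, and the example frequencies are all assumptions for illustration; a real back end would shape a speech-like source, not a bare sine.

```python
import math


def synth_glide(f0_start, f0_end, duration_s, sample_rate=16000):
    """Return sine samples whose pitch moves linearly from f0_start to f0_end (Hz).

    Hypothetical helper for illustration only, not part of any TTS engine.
    """
    n = int(duration_s * sample_rate)
    samples = []
    phase = 0.0
    for i in range(n):
        # Linearly interpolate the fundamental frequency across the interval.
        f0 = f0_start + (f0_end - f0_start) * (i / max(n - 1, 1))
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous while f0 changes.
        phase += 2.0 * math.pi * f0 / sample_rate
    return samples


# A quarter-second rising contour, from 120 Hz up to 180 Hz (made-up values).
rising = synth_glide(120.0, 180.0, 0.25)
```

Accumulating phase sample by sample, instead of computing `sin(2*pi*f0*t)` directly, avoids discontinuities when the frequency changes, which is the kind of detail a DSP person on the team would be expected to know.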

All the best and hope that helps,
Sriram.

凉栀 2024-10-21 03:14:37

Are you a native speaker of any of these languages? You are absolutely going to need fluent speakers of all of these languages and, quite possibly, a linguist on your team in addition to programmers with good DSP skills.

The languages you've listed are all tonal languages. So, in addition to the usual challenges of building a text-to-speech system for Romance (French, Spanish, Italian, etc.) or Germanic (English, German, etc.) languages, you will have to deal with tonality as well. In a tonal language, multiple words can have essentially the same pronunciation (at least, they sound the same to an untrained Western ear), and they may even have the same Latin orthography. The sole difference between them in speech is the pitch of the word relative to other words in the sentence, or a change in pitch that occurs as the word is spoken.

If you are unlucky and these words do share the same Latin orthography, you will need someone on the team with expertise in artificial intelligence, because your program will have to work out from the sentence context which word is intended in order to produce the correct sound.
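If the orthography does mark tone, the text front end can read it directly instead of guessing from context. As a minimal sketch, assuming Hmong is written in the Romanized Popular Alphabet (RPA), where a final consonant letter marks the tone and the mid tone is left unmarked, the split could look like the following. The tone labels are approximate, and the names `RPA_TONES`, `split_tone`, and `analyse` are illustrative assumptions, not taken from any existing engine; the language experts should verify the mapping.

```python
# Hypothetical mapping from RPA tone letters to rough tone labels.
# Assumption: labels are approximate and must be checked by the language experts.
RPA_TONES = {
    "b": "high",
    "j": "high-falling",
    "v": "mid-rising",
    "s": "low",
    "g": "breathy",
    "m": "low-creaky",
    "d": "low-rising",
}
DEFAULT_TONE = "mid"  # RPA leaves the mid tone unmarked


def split_tone(syllable):
    """Split one RPA syllable into (segmental base, tone label)."""
    s = syllable.lower()
    # A final tone letter is stripped only if something is left over as the base.
    if len(s) > 1 and s[-1] in RPA_TONES:
        return s[:-1], RPA_TONES[s[-1]]
    return s, DEFAULT_TONE


def analyse(text):
    """Split whitespace-separated syllables into (base, tone) pairs."""
    return [split_tone(word) for word in text.split()]
```

For example, `analyse("pob po")` yields two syllables with the same base `po` but different tone labels, which is exactly the kind of contrast the synthesizer has to render as pitch.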

Good luck with your project!

破晓 2024-10-21 03:14:37

The short answer is lots.

To develop a worthwhile TTS from scratch would take tens of thousands of programming hours.

Alternatively, you could work in partnership with us, and we could give you exclusive use of those languages for a period of time.

The company I work with is the world's leading provider of mobile and automotive text-to-speech software and languages.

By using our engine, you would save years of effort and hundreds of thousands in development costs, and be able to deliver those languages within a couple of months.

Cheers, Neil
