How can I dumb down our state-of-the-art text-to-speech?

Posted 2024-09-24 09:46:58, 612 words, 6 views, 0 comments

Back in the old days, text-to-speech, as cutting edge as it was, was very imperfect. When you typed in a word, it would pretty much read it how you spelled it... in monotone. Oftentimes, the result would be very funny. Nowadays, Text-to-Speech is too intelligent to goof in ways that can bring a laugh.

As a personal project, I'd like to write an application that can bring back this old style of text-to-speech, if only as a toy. In .NET, I have available to me both System.Speech.dll and the SpeechLib COM objects (the Microsoft Speech Object Library). Both seem to use the OS's built-in text-to-speech, which, again, is too dang smart. Is there any way to configure these to disable whatever it is that makes them intelligent?

I've tried a few different 'SayAs' options and I've tried setting the culture to invariant (exception!), and now I'm looking at SSML. It's beginning to look like I'll have to find the old technology itself, but I don't even know where to begin there.

As an example of the chaos I'm hoping to see, here's some Moonbase Alpha for you: http://www.youtube.com/watch?v=Hv6RbEOlqRo (Make sure you are wearing headphones!)

Con flab these newfangled text-to-phoneme converters, and normalizers, and cableless phones, and...

Comments (2)

留蓝 2024-10-01 09:46:58

You probably want what was called the "NRL Algorithm", which was used by the Votrax speech synthesizers in the 1970s and 1980s. I remember a friend of mine had one of those that we connected (via serial port) to my Osborne I. We got a lot of laughs out of the way it "said" things. "Computer" came out "com poo ter", for example.

Or maybe it was a Microvox that my friend had. That seems to ring a bell. At the time, all the text-to-speech boxes used pretty much the same technology. The linked article is a fountain of information. About halfway down is a longish section on text-to-speech conversion. It describes the rules and the basic algorithm. I suspect that, with some study and experimentation, you could duplicate the Microvox's speech synthesis.

The NRL Algorithm was implemented by the Unix speak command, the source of which is apparently lost to the great bit bucket of history. However, M.D. McIlroy wrote a paper about it, "Synthetic English Speech by Rule" (it's a tar file containing scanned pages).
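
To give a feel for what "by rule" means, here is a minimal sketch of that style of converter, assuming (as I understand these systems) that each rule pairs a letter pattern and its left/right context with some output sounds, and that the first matching rule wins as the text is scanned left to right. The rule table, the sound labels, and the `to_sounds` name are all made up for illustration; this is not the real NRL table, just a toy that makes the same kind of mistakes.

```python
# Toy rule-based letter-to-sound converter.
# NOTE: RULES and DEFAULTS are invented for this sketch, not the real NRL rules.

# Each rule is (left context, letters to match, right context, output sound).
# An empty context matches anything; " " in a context means a word boundary.
RULES = [
    ("", "ch", "", "CH"),
    ("", "sh", "", "SH"),
    ("", "th", "", "TH"),
    ("", "ee", "", "IY"),
    ("", "oo", "", "UW"),
    ("", "c", "e", "S"),   # "c" before "e" is soft
    ("", "c", "i", "S"),   # "c" before "i" is soft
    ("", "e", " ", ""),    # word-final "e" is silent
]

# Fallback: one fixed sound per letter, with no context at all.
DEFAULTS = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "UW",
    "v": "V", "w": "W", "x": "K S", "y": "IY", "z": "Z",
}
RULES += [("", letter, "", sound) for letter, sound in DEFAULTS.items()]


def to_sounds(text):
    """Scan left to right and apply the first rule that matches at each position."""
    padded = " " + text.lower() + " "  # spaces act as word boundaries
    out = []
    i = 1
    while i < len(padded) - 1:
        for left, match, right, sound in RULES:
            end = i + len(match)
            if (padded.startswith(match, i)
                    and padded[:i].endswith(left)
                    and padded.startswith(right, end)):
                out.append(sound)
                i = end
                break
        else:
            i += 1  # no rule matched (space, digit, punctuation): skip it
    return " ".join(s for s in out if s)


print(to_sounds("computer"))
```

With this toy table, "computer" comes out as K AA M P UW T EH R, which is roughly the "com poo ter" effect: there is no dictionary and no stress or intonation model, so anything the rules don't cover gets read letter by letter.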

This would be a fun project to play with if I had the time. Good luck on it. Let me know if you get anywhere with it.

我恋#小黄人 2024-10-01 09:46:58

Well, I just managed to stumble across the old "Microsoft Voice Text" library: vtext.dll

This seems to be what I was looking for! Compared to modern TTS libraries, the interface is very simple. The result doesn't seem to be exactly the same as the voice in that video I linked, but that was probably a different implementation. Either way, it's time to reminisce.

// Assumes the project has a COM reference to vtext.dll, exposed here as HTTSLib.
var tts = new HTTSLib.TextToSpeech();
tts.Speak("ebrbrbrbrbrbrbrbr");

For some reason it crashes vshost.exe when I make it say "here". But since this is just a dumb personal project, I can ignore it.
