Using TTS on Android: punctuation is read aloud
CONTEXT: My application is sending sentences to whatever TTS engine the user has. Sentences are user-generated and may contain punctuation.
PROBLEM: Some users report that punctuation is read aloud (the TTS says "comma", etc.) on SVOX, Loquendo, and possibly other engines.
QUESTION:
- Should I strip all punctuation?
- Should I transform the punctuation using this kind of API?
- Should I let the TTS engine deal with the punctuation?
The same user who sees the problem with Loquendo does not have it in another Android application called FBReader, so I guess the 3rd option is not the right thing to do.
Comments (2)
I had the same problem with one of my apps.
The input string was:
Next alarm in 10 minutes,it will be 2:45 pm
and the TTS engine would say:
Next alarm in 10 minutes comma it will be 2:45 pm
The problem was fixed just by adding a space after the comma, like this:
Next alarm in 10 minutes, it will be 2:45 pm
This is a stupid mistake, and maybe your problem is more complicated than that, but it worked for me. :)
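The fix above can also be applied automatically before the text is handed to the engine. What follows is a minimal sketch, not something from the original answer: the class name PunctuationSpacing and the method normalize are hypothetical, and limiting the rule to commas and semicolons that are directly followed by a letter is an assumption, chosen so that numbers such as 1,000 and times such as 2:45 are left alone. Whether a given engine then stops reading the comma aloud still depends on the engine.

```java
// Hypothetical helper (not from the original answer): inserts a missing space
// after a comma or semicolon that is immediately followed by a letter.
final class PunctuationSpacing {

    private PunctuationSpacing() {}

    // "([,;])" captures the punctuation mark; "(?=\\p{L})" is a lookahead that
    // only matches when the next character is a Unicode letter, so "1,000" and
    // "a, b" are left unchanged.
    static String normalize(String text) {
        if (text == null) {
            return "";
        }
        return text.replaceAll("([,;])(?=\\p{L})", "$1 ");
    }

    public static void main(String[] args) {
        String raw = "Next alarm in 10 minutes,it will be 2:45 pm";
        // Prints: Next alarm in 10 minutes, it will be 2:45 pm
        System.out.println(normalize(raw));
    }
}
```

The normalized string would then be passed to the engine in place of the raw one, for example as the text argument of TextToSpeech.speak.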
So, you're worried about what back-alley-acquired text-to-speech engine the user might happen to have selected as their default... presumably because you don't want your app to look bad due to this engine's unknown/bad behavior. Understandable.
The (good) fact is, though, that the TTS's behavior is not actually your responsibility unless you decide to embed an engine in the app itself (Difficulty: Hard, Recommended? No).
Engines can and should be presumed to adhere to Android rules and behaviors dictated here... and presumed to supply their own sufficient set of configuration options in the Android system settings (home\settings\language&locale\TTS) which may or may not include pronunciation options. The user should also be presumed intelligent enough to install an engine that they are satisfied with.
It is a slippery slope to take on the job of anticipating and "correcting" for unknown and unwanted engine behaviors (at least in engines that you haven't tested yourself).
A SIMPLE AND GOOD OPTION (Difficulty: Easy):
A BETTER OPTION (Difficulty: Medium):
Also, one thing to note is that there are many, many differences between engines (whether they use embedded voices vs online, response time, initialization time, reliability/adherence to Android specs, behavior across Android API levels, behavior across their own version history, the quality of voices, not to mention language capability)... differences that may be even more important to users than whether or not punctuation is pronounced.
You say "My application is sending sentences to whatever TTS engine the user has." Well... "That's yer problem right there." Why not give the user a choice on what engine to use?
Which leads us to...
AN EVEN BETTER OPTION (Difficulty: Hard and Good! [in my humble opinion]):
SIDE-NOTES:
Don't let the SVOX/PICO (emulator) engine get you too worried -- it has many flaws and is not even designed or guaranteed to run on Android above API ~20, but it is still included on emulator images up to API ~24, resulting in "unpredictable results" that don't actually reflect reality. I have yet to see this engine on any real hardware device made within the last seven years or so.
Since you say that "sentences are user generated," I would be more worried about solving the problem of what language they are going to be typing in! I'll look out for a question on that! :)