SpeechSynthesizer.PhonemeReached event and control characters

Posted 2024-07-08 13:44:02

I am creating a small Silverlight widget that pronounces a word and highlights each syllable as it is pronounced.

As part of this, I am using the SpeechSynthesizer.PhonemeReached event to determine the start and end times of each phoneme (as a step in figuring out the start and end times of each syllable).

The weird thing is that the PhonemeReachedEventArgs.Phoneme property is sometimes a control character, at least (but possibly not exclusively) U+0004 END OF TRANSMISSION, including places which are not, umm, the end of the transmission (like, say, the beginning).

I can't find any documentation of what this is supposed to mean. Does anyone know?

EDIT: To clarify, I'm not doing the speech synthesis in Silverlight (since that isn't supported). I am doing it on the server and returning the syllable boundary times and the IPA transcription of the word in an HTTP header of the audio response. I probably shouldn't have mentioned the Silverlight part at all, since it isn't really relevant; I just didn't think much about explaining the context. Oops. :)

Comments (1)

┾廆蒐ゝ 2024-07-15 13:44:02

If you are trying to figure out the start and end times of each phoneme (which is really a different question), try using the PhonemeReachedEventArgs.Duration property. The start time of a phoneme is the start time of the phrase plus the cumulative durations of all previously pronounced phonemes. Its end time is its own start time plus its own duration.
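The arithmetic described above can be sketched in a few lines. This is a language-neutral illustration written in Python, not System.Speech code: the `PhonemeTiming` class, the `phoneme_timings` function, and the sample phonemes and durations are all hypothetical, and the durations are assumed to have been collected from PhonemeReachedEventArgs.Duration in the order the events fired.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PhonemeTiming:
    """Hypothetical record holding one phoneme and its time window."""
    phoneme: str
    start_ms: float
    end_ms: float


def phoneme_timings(
    events: Sequence[Tuple[str, float]],
    phrase_start_ms: float = 0.0,
) -> List[PhonemeTiming]:
    """Turn (phoneme, duration_ms) pairs into start/end times.

    Each phoneme starts where the previous one ended; the first one
    starts at the start time of the phrase.
    """
    timings: List[PhonemeTiming] = []
    cursor = phrase_start_ms  # running sum of all earlier durations
    for phoneme, duration_ms in events:
        start = cursor
        end = start + duration_ms
        timings.append(PhonemeTiming(phoneme, start, end))
        cursor = end
    return timings


if __name__ == "__main__":
    # Made-up durations, purely for illustration.
    sample = [("h", 80.0), ("ə", 60.0), ("l", 70.0), ("oʊ", 150.0)]
    for t in phoneme_timings(sample, phrase_start_ms=100.0):
        print(f"{t.phoneme!r}: {t.start_ms:.0f} ms to {t.end_ms:.0f} ms")
```

Syllable boundaries could then be derived by grouping consecutive entries, taking the start of the first phoneme and the end of the last phoneme in each group.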

As for the question you actually posted, I am assuming that you are using English as your language of synthesis. In that case, it is likely that the "characters" you are seeing are actually index values into the American English phoneme table. You may see the same thing with Chinese phonemes; Japanese phonemes, however, have Unicode representations, all of which fall outside the "control character" range.
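One way to check this hypothesis is to log the numeric code point of every character in the reported phoneme string rather than printing the string itself. The sketch below is a Python illustration of that idea, not part of System.Speech; `describe_phoneme` is a hypothetical helper, and reading the code point as a phoneme table index is the assumption made in this answer, not documented behavior.

```python
import unicodedata
from typing import List, Tuple


def describe_phoneme(phoneme: str) -> List[Tuple[int, bool]]:
    """Return (code_point, is_control) for each character in the string.

    A character in Unicode category "Cc" is a control character; under
    the assumption above, its code point may be a phoneme table index
    rather than a printable symbol.
    """
    return [(ord(ch), unicodedata.category(ch) == "Cc") for ch in phoneme]


if __name__ == "__main__":
    # U+0004 is the value mentioned in the question; "ə" is an ordinary IPA symbol.
    for value in ("\u0004", "ə"):
        for code_point, is_control in describe_phoneme(value):
            kind = "control character" if is_control else "printable character"
            print(f"U+{code_point:04X} ({code_point}): {kind}")
```

If the logged values consistently fall in a small range and repeat for the same sounds, that would support the index interpretation.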
