How to write a function like IsWordPronounceable(SomeWord:String): boolean;

Posted on 2024-09-02 23:47:05

I would like to write a function IsWordPronounceable(SomeWord:String): boolean; for the English language. I am working with SAPI Speech Recognition and I need this function. I use the Delphi compiler, but C/C#/C++ or any other language is OK. Please help; I don't know how to start.

At first, I thought adding a grammar rule could solve the problem. The scenario is highlighting the text that is being said to the user. But the engine cannot recognize words that are not pronounceable.

Comments (4)

云裳 2024-09-09 23:47:05

This is not exactly easy to do. The way I would do it is with some simple statistical analysis.

Start off by downloading a dictionary of English words (or any language, really; you just need a dictionary of words that are "pronounceable"). Then, take each word in the dictionary and break it up into overlapping 3-letter blocks. So given the word "dictionary", you'd break it up into "dic", "ict", "cti", "tio", "ion", "ona", "nar", and "ary". Then add each three-letter block from all the words in the dictionary to a collection that maps the block to the number of times it appears. Something like this:

"dic" -> 36365
"ict" -> 2721
"cti" -> 532

And so on... Next, normalize the counts by dividing each one by the total number of words in the dictionary. That way, you have a mapping from each three-letter combination to roughly the fraction of dictionary words that contain it (only roughly, because a block can occur more than once in the same word).
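
The counting and normalizing steps above could be sketched in Python roughly as follows. This is only an illustration: the names break_into_blocks and build_block_frequency and the four-word sample list are assumptions made for the example, and a real table would be built from a full dictionary file.

```python
from collections import Counter


def break_into_blocks(word, size=3):
    """Return the overlapping blocks of `size` letters in `word`, left to right."""
    word = word.lower()
    return [word[i:i + size] for i in range(len(word) - size + 1)]


def build_block_frequency(dictionary_words, size=3):
    """Map each block to (occurrences across all words) / (number of words)."""
    words = list(dictionary_words)
    if not words:
        return {}
    counts = Counter()
    for word in words:
        counts.update(break_into_blocks(word, size))
    return {block: count / len(words) for block, count in counts.items()}


# Tiny stand-in word list (an assumption for this example only).
sample_words = ["dictionary", "diction", "action", "nation"]
block_frequency = build_block_frequency(sample_words)

print(break_into_blocks("dictionary"))
print(block_frequency["tio"], block_frequency["dic"])
```

Words shorter than the block size produce no blocks at all, so a caller has to decide separately how to treat them.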

Finally, implement your IsWordPronounceable method something like this:

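// C#-style sketch. blockFrequency (a Dictionary<string, double>) and THRESHOLD
// are assumed to have been built and tuned as described above.
bool IsWordPronounceable(string word)
{
    foreach (string block in BreakIntoThreeLetterBlocks(word))
    {
        // A block that never appears in the dictionary counts as frequency 0;
        // indexing blockFrequency[block] directly would throw for a missing key.
        if (!blockFrequency.TryGetValue(block, out double frequency) || frequency < THRESHOLD)
            return false;
    }
    return true;
}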

Obviously, there are a few parameters you'll want to "tune". The THRESHOLD parameter is one; the size of the blocks is another, and 2 or 4 might work better than 3. It'll take a bit of experimenting to get it right, I think.
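
One possible way to do that tuning is to score each combination of block size and threshold against a small hand-labeled word list. The Python sketch below restates the block logic compactly so that it runs by itself; the function names, the six-word dictionary, and the labeled examples are all hypothetical and far too small for real tuning.

```python
from collections import Counter


def blocks(word, size):
    word = word.lower()
    return [word[i:i + size] for i in range(len(word) - size + 1)]


def train(words, size):
    """Block -> occurrences divided by the number of dictionary words."""
    counts = Counter(b for w in words for b in blocks(w, size))
    return {b: c / len(words) for b, c in counts.items()}


def is_word_pronounceable(word, frequency, size, threshold):
    # Blocks never seen in the dictionary count as frequency 0.
    # Note: a word shorter than `size` has no blocks and is accepted.
    return all(frequency.get(b, 0.0) >= threshold for b in blocks(word, size))


def accuracy(frequency, size, threshold, labeled):
    """Fraction of labeled words that the check classifies as expected."""
    hits = sum(is_word_pronounceable(w, frequency, size, threshold) == expected
               for w, expected in labeled)
    return hits / len(labeled)


# Hypothetical toy data (assumptions for this example only).
dictionary_words = ["nation", "station", "ration", "action", "motion", "notion"]
labeled = [("ratio", True), ("notation", True), ("xqzvk", False), ("tnoiat", False)]

for size in (2, 3):
    frequency = train(dictionary_words, size)
    for threshold in (0.1, 0.5, 0.9):
        print(size, threshold, accuracy(frequency, size, threshold, labeled))
```

With a dictionary this small, longer blocks reject real words such as "notation" simply because a block like "ota" was never seen, which is the kind of trade-off the tuning has to balance.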

凝望流年 2024-09-09 23:47:05

Just an idea (maybe a crazy one); I've never tried it.
Can you feed the output of the Text-To-Speech into the input of the Speech-To-Text?
Then, in a perfect world, anything that is not recognized (or that comes back as a different word) is not pronounceable.
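
The shape of that round trip might look like the Python sketch below. The synthesize and recognize parameters are hypothetical callables that the caller would have to supply by wrapping a real text-to-speech and speech-to-text engine; they are not SAPI functions, and the fake engines exist only so the sketch runs without any speech software.

```python
def round_trip_pronounceable(word, synthesize, recognize):
    """Speak `word`, listen to the result, and compare the two texts.

    `synthesize` (text -> audio) and `recognize` (audio -> text or None)
    are assumed to be provided by the caller.
    """
    audio = synthesize(word)
    heard = recognize(audio)
    if heard is None:  # the recognizer gave up
        return False
    return heard.strip().lower() == word.strip().lower()


# Stand-in engines (assumptions for this example only).
def fake_synthesize(text):
    return text.encode("utf-8")  # pretend these bytes are audio


def fake_recognize(audio):
    text = audio.decode("utf-8")
    # Pretend the recognizer only understands words that contain a vowel.
    return text if any(ch in "aeiou" for ch in text.lower()) else None


print(round_trip_pronounceable("hello", fake_synthesize, fake_recognize))
print(round_trip_pronounceable("xkcd", fake_synthesize, fake_recognize))
```

Whether a real recognizer behaves this cleanly is exactly the untested part of the idea.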

暖伴 2024-09-09 23:47:05

This functionality is typically handled by the speech engine itself. If your goal is simply to get the text-to-speech engine to pronounce some things and spell others, speech engines other than the default may do a sufficient job. Check out Acapela for example.

To write this functionality yourself, I'd go for the low-hanging fruit first.

  • check the input for numbers/unpronounceable characters, fail if found
  • check the input against a dictionary of words, pass if found
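
The two checks above could be sketched in Python as a pre-filter that returns an answer only when it is sure. The function name, the choice to allow apostrophes and hyphens, and the three-word dictionary are assumptions made for the example.

```python
import string

# Assumption: ASCII letters plus apostrophe and hyphen count as pronounceable characters.
ALLOWED_CHARACTERS = set(string.ascii_letters + "'-")


def quick_pronounceability_check(word, known_words):
    """Return False, True, or None (undecided) using the two cheap checks."""
    if not word or any(ch not in ALLOWED_CHARACTERS for ch in word):
        return False  # digits, symbols, or empty input: fail
    if word.lower() in known_words:
        return True   # found in the dictionary: pass
    return None       # neither check decided; fall back to another method


# Stand-in dictionary (an assumption for this example only).
known_words = {"hello", "world", "don't"}

for candidate in ("Hello", "h3llo", "blorp"):
    print(candidate, quick_pronounceability_check(candidate, known_words))
```

Returning None for the undecided case leaves room to hand the word on to a statistical or pattern-based check.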

A more advanced technique similar to codeka's solution would be to build a list of valid syllable patterns then match your input against them. There may be even more complex techniques, but to go there you need to become familiar with linguistics.
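
A very rough Python sketch of the syllable-pattern idea is shown below. It works on spelling rather than on actual sounds, and the onset and coda lists are hand-picked, deliberately incomplete assumptions, so it illustrates the matching step only and is not a linguistically sound rule set.

```python
import re

# Hand-picked, incomplete lists (assumptions for illustration only).
ONSETS = ["str", "spl", "spr", "bl", "br", "ch", "cl", "cr", "dr", "fl", "fr",
          "gl", "gr", "pl", "pr", "sh", "sl", "sp", "st", "th", "tr",
          "b", "c", "d", "f", "g", "h", "j", "k", "l", "m", "n", "p", "r",
          "s", "t", "v", "w", "y", "z"]
CODAS = ["ng", "nd", "nt", "st", "ck", "sh", "ch", "th", "rd", "rt", "ll", "ss",
         "b", "c", "d", "f", "g", "k", "l", "m", "n", "p", "r", "s", "t", "x"]
VOWELS = "[aeiouy]+"

# One syllable: optional onset, one or more vowel letters, optional coda.
SYLLABLE = "(?:%s)?%s(?:%s)?" % ("|".join(ONSETS), VOWELS, "|".join(CODAS))
WORD = re.compile("(?:%s)+" % SYLLABLE)


def matches_syllable_patterns(word):
    """True if the whole word can be split into syllables of the form above."""
    return WORD.fullmatch(word.lower()) is not None


for candidate in ("string", "dictionary", "xqzvk", "tnoiat"):
    print(candidate, matches_syllable_patterns(candidate))
```

Because the lists are incomplete, some real words are rejected as well; "strengths" fails here because no listed coda covers its final consonant cluster.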

又爬满兰若 2024-09-09 23:47:05

This means you can't rely on text-to-speech alone; you also need to check whether the given words are valid for the language. You would also need something like a training engine for the text-to-speech data, so that the data is usable by your function.

If you only want to check the correctness of a word (I mean no speech, only the validity of the word), then the answer given by codeka is quite cool. You can check the word against a dictionary of the particular language.

Thanks.
