Measure the pronounceability of a word?
I'm tinkering with a domain name finder and want to favour those words which are easy to pronounce.
Example: nameoic.com (bad) versus namelet.com (good).
I was thinking something like Soundex might be appropriate, but it doesn't look like I can use it to produce a comparative score.
PHP code for the win.
Comments (3)
Here is a function which should work with most common words. It should give you a result between 1 (perfect pronounceability according to the rules) and 0.
The following function is far from perfect (it doesn't quite like words such as Tsunami [0.857]), but it should be fairly easy to tweak for your needs.
I think the problem could be boiled down to parsing the word into a candidate set of phonemes, then using a predetermined list of phoneme pairs to determine how pronounceable the word is.
For example: "skill" phonetically is "/s/k/i/l/". "/s/k/", "/k/i/" and "/i/l/" should all have high pronounceability scores, so the word should score highly.
"skpit" phonetically is "/s/k/p/i/t/". "/k/p/" should have a low pronounceability score, so the word should score low.
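A rough sketch of the pair-scoring idea, in Python for brevity (it should port to PHP almost line for line). It makes two big simplifying assumptions: letters stand in for phonemes, and the pair scores come from a tiny hand-written rule set rather than a real phoneme-pair table. `VOWELS`, `EASY_CLUSTERS`, `pair_score` and `pronounceability` are made-up names, and the numbers 1.0, 0.5 and 0.1 are arbitrary.

```python
# Assumption: letters approximate phonemes; a real version would first
# convert the word to phonemes and look pairs up in a measured table.
VOWELS = set("aeiouy")

# Hypothetical, non-exhaustive list of consonant pairs treated as easy to say.
EASY_CLUSTERS = {
    "sk", "st", "sp", "sl", "sm", "sn", "tr", "dr", "pr", "br", "cr", "gr",
    "fr", "pl", "bl", "cl", "gl", "fl", "ll", "ss", "tt", "nt", "nd", "ng",
    "mp", "rt", "rd", "ch", "sh", "th",
}


def pair_score(a, b):
    """Score one adjacent pair of letters between 0 and 1."""
    a_vowel, b_vowel = a in VOWELS, b in VOWELS
    if a_vowel != b_vowel:
        return 1.0  # consonant next to a vowel: easy
    if a_vowel and b_vowel:
        return 0.5  # two vowels in a row: somewhat awkward
    # two consonants: fine only if the cluster is on the easy list
    return 1.0 if a + b in EASY_CLUSTERS else 0.1


def pronounceability(word):
    """Average the pair scores over every adjacent pair in the word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if len(letters) < 2:
        return 1.0 if letters else 0.0
    pairs = list(zip(letters, letters[1:]))
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    for w in ("skill", "skpit", "namelet", "nameoic"):
        print(w, round(pronounceability(w), 3))
```

Averaging keeps the score in the 0 to 1 range regardless of word length; taking the minimum pair score instead would punish a single bad cluster much harder.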
Use a Markov model (on letters, not words, of course). The probability the model assigns to a word is a pretty good proxy for ease of pronunciation. You'll have to normalize for length, since longer words are inherently less probable.
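A minimal sketch of that approach, again in Python and under stated assumptions: a first-order (bigram) letter model, add-one smoothing for unseen transitions, and length normalization by averaging the log probability per transition. The names `train`, `avg_log_prob` and `TRAINING_WORDS` are invented for illustration, and the word list is a toy placeholder; a real model would be trained on a large dictionary.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"  # markers for word start and word end

# Toy placeholder corpus (assumption); use a real word list in practice.
TRAINING_WORDS = [
    "name", "let", "letter", "table", "tablet", "market", "planet",
    "simple", "camel", "melon", "lemon", "talent", "matter", "later", "metal",
]


def _symbols(word):
    """Lowercase the word, drop non-letters, add start/end markers."""
    return [START] + [c for c in word.lower() if c in ALPHABET] + [END]


def train(words):
    """Count how often each symbol is followed by each other symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = _symbols(word)
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return counts


def avg_log_prob(word, counts):
    """Average log probability per transition (higher is more word-like)."""
    chars = _symbols(word)
    vocab = len(ALPHABET) + 1  # possible next symbols: 26 letters plus END
    total = 0.0
    for a, b in zip(chars, chars[1:]):
        row = counts.get(a, {})
        # add-one smoothing so unseen transitions get a small nonzero probability
        total += math.log((row.get(b, 0) + 1) / (sum(row.values()) + vocab))
    return total / (len(chars) - 1)  # normalize for word length


if __name__ == "__main__":
    model = train(TRAINING_WORDS)
    for w in ("namelet", "nameoic"):
        print(w, round(avg_log_prob(w, model), 3))
```

The scores are negative log values, so they are only meaningful relative to each other; candidates can be ranked by score, or mapped to a 0 to 1 range with `math.exp` if a bounded number is needed.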