Code to count the number of syllables in the words in a file
So far I have the following code, which counts the number of syllables in the words of cmudict (the CMU Pronouncing Dictionary). It does this for every word in the dictionary. Now I need to replace cmudict with my own input file and print the number of syllables for each word in that file. Simply opening the input file in read mode does not work, because a file object has no dict() method the way cmudict does.
The code is given below:
from curses.ascii import isdigit
from nltk.corpus import cmudict

d = cmudict.dict()  # get the CMU Pronouncing Dict

def nsyl(word):
    """return the max syllable count in the case of multiple pronunciations"""
    return max([len([y for y in x if isdigit(y[-1])]) for x in d[word.lower()]])

w_words = dict([(w, nsyl(w)) for w in d.keys() if w[0] == 'a'or'z'])
worth_abbreviating = [(k,v) for (k,v) in w_words.iteritems() if v > 3]
print worth_abbreviating
Can anyone please help me out?
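One possible way to approach this, as a rough sketch rather than a definitive answer: keep cmudict as the pronunciation lookup and take the words from the input text instead of from the dictionary's keys. The version below is written for Python 3, and the helper syllables_in_text and the tiny SAMPLE_DICT are made up for illustration. SAMPLE_DICT only mimics the shape of what cmudict.dict() returns (lowercase word mapped to a list of pronunciations, each a list of phonemes), so the sketch runs without NLTK installed.

```python
import re

def nsyl(word, pron_dict):
    """Return the max syllable count over all pronunciations, or None if unknown."""
    prons = pron_dict.get(word.lower())
    if not prons:
        return None  # word is not in the pronunciation dictionary
    # In CMUdict, vowel phonemes end in a stress digit (0, 1 or 2),
    # so counting those phonemes counts syllables.
    return max(sum(1 for ph in pron if ph[-1].isdigit()) for pron in prons)

def syllables_in_text(text, pron_dict):
    """Map each distinct word in text to its syllable count (None if not found)."""
    words = re.findall(r"[A-Za-z']+", text)
    return {w.lower(): nsyl(w, pron_dict) for w in words}

# Hand-written stand-in (an assumption for this sketch) with the same
# shape as the real cmudict.dict() result.
SAMPLE_DICT = {
    "hello": [["HH", "AH0", "L", "OW1"], ["HH", "EH0", "L", "OW1"]],
    "world": [["W", "ER1", "L", "D"]],
    "dictionary": [["D", "IH1", "K", "SH", "AH0", "N", "EH2", "R", "IY0"]],
}

counts = syllables_in_text("Hello, world! Dictionary xyzzy", SAMPLE_DICT)
print(counts)  # {'hello': 2, 'world': 1, 'dictionary': 4, 'xyzzy': None}
```

With the real data, pron_dict would be cmudict.dict() from nltk.corpus (after downloading the cmudict corpus), and text would be the contents of the input file, for example the result of open("input.txt").read(), where input.txt is a placeholder name. Words missing from the dictionary come back as None instead of raising KeyError, which seemed the safer default for arbitrary input.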
Comments (1)
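Not sure this will solve the whole problem, but the condition in the w_words line,
    if w[0] == 'a'or'z'
should probably be
    if w[0] == 'a' or w[0] == 'z'
since the original is parsed as
    if (w[0] == 'a') or ('z')
The string 'z' is truthy, so the condition is always satisfied and no words are filtered out.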
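A small sketch of that behaviour, in Python 3 and with made-up helper names and a made-up word list, may make the difference concrete. It compares the condition as written in the question with one that tests the first letter against both values.

```python
def starts_with_a_or_z_buggy(word):
    # Parsed as (word[0] == 'a') or 'z'. When the comparison is False,
    # the expression evaluates to 'z', a non-empty string, which is truthy.
    return word[0] == 'a' or 'z'

def starts_with_a_or_z(word):
    # Tests the first letter against both candidates.
    return word[0] in ('a', 'z')

words = ["apple", "banana", "zebra"]
buggy = [w for w in words if starts_with_a_or_z_buggy(w)]
fixed = [w for w in words if starts_with_a_or_z(w)]

print(buggy)  # ['apple', 'banana', 'zebra']
print(fixed)  # ['apple', 'zebra']
```

The buggy filter lets every word through, which matches the question's description of the code counting syllables for all the words in the dictionary.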