Sentences without a POS tag - Python
I have tokenized the text and want to print "error" for each sentence that lacks a given POS tag (here 'VB'), but it prints "error" for every single sentence. How should I change it?

    sents = nltk.sent_tokenize(text)
    for sent in sents:
        tokens = nltk.word_tokenize(sent)
        tagged = nltk.pos_tag(tokens)
        for pos in tagged:
            if 'VB' not in sents :
                print('error')
As @baileythegreen pointed out, even though your last condition comes after tagging the sentence,

    if 'VB' not in sents:

is checking sents, the list of all sentences, not the tags of the sentence you are iterating over. No sentence is exactly the string 'VB', so the condition is true on every iteration, even if the current sentence has a 'VB' tag in it.
You should probably use a flag, e.g. set has_VB = False before the inner loop, and make the condition inside the loop

    if 'VB' in pos[1]: has_VB = True

(pos is a (word, tag) tuple, so pos[1] is the tag), and after the for loop is finished

    if not has_VB: print('error')
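
For what it's worth, here is a minimal sketch of the flag idea above, written with any() instead of an explicit flag variable. The helper names has_verb and sentences_without_verb are my own and not part of NLTK, and I'm assuming "has a verb" means "some tag starts with 'VB'" (VB, VBD, VBG, VBN, VBP, VBZ in the Penn Treebank tag set). The NLTK part also assumes the tokenizer and tagger models have already been fetched with nltk.download.

```python
def has_verb(tagged):
    """Return True if any (word, tag) pair has a tag starting with 'VB'."""
    return any(tag.startswith("VB") for _word, tag in tagged)


def sentences_without_verb(text):
    """Return the sentences of `text` in which the tagger finds no verb tag."""
    # Imported here so has_verb() stays usable without NLTK installed.
    import nltk

    missing = []
    for sent in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sent))
        if not has_verb(tagged):
            missing.append(sent)
    return missing


# Hand-tagged examples, so this part runs without any NLTK models.
with_verb = [("She", "PRP"), ("runs", "VBZ"), (".", ".")]
without_verb = [("A", "DT"), ("red", "JJ"), ("car", "NN"), (".", ".")]
print(has_verb(with_verb))     # True
print(has_verb(without_verb))  # False
```

With real text you would then do something like: for sent in sentences_without_verb(text): print('error', sent). The check runs once per sentence rather than once per token, so each sentence is reported at most once.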