Sentences without a POS - Python

Posted 2025-01-18 15:35:40

I have tokenized the text and want to print an error for each sentence that lacks a given POS tag (here a verb, 'VB'), but it prints an error for every single sentence. How should I change it?

sents = nltk.sent_tokenize(text)

for sent in sents:
    tokens = nltk.word_tokenize(sent)
    tagged = nltk.pos_tag(tokens)
    
    for pos in tagged:        
        if 'VB' not in sents :
             print('error')

Comments (2)

路弥 2025-01-25 15:35:40
import nltk  # needs the NLTK tokenizer and tagger data (see nltk.download)

text = "this sentence has verb. this one not"
sents = nltk.sent_tokenize(text)

for sent in sents:
    has_verb = False
    tokens = nltk.word_tokenize(sent)
    pos_tags = nltk.pos_tag(tokens)  # list of (word, tag) pairs
    for word, tag in pos_tags:
        if tag.startswith('VB'):  # matches VB, VBD, VBG, VBN, VBP, VBZ
            has_verb = True
            break
    if not has_verb:
        print(f'error: "{sent}" does not have verb')
素罗衫 2025-01-25 15:35:40

As @baileythegreen pointed out, although your last condition comes after tagging the sentence, if 'VB' not in sents: checks the whole list of sentence strings, not the tags of the current sentence.
That test is true on every iteration, even if the sentence you're iterating over has a 'VB' tag in it.
You should probably use a flag, e.g. has_VB = False, and the condition should be if 'VB' in pos[1]: has_VB = True, where pos is each (word, tag) pair in tagged.
After the for loop is finished, check if not has_VB: print('error').
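
The flag idea above can also be written without an explicit flag. This is only a minimal sketch: has_verb is a hypothetical helper name, and the tags in tagged_sents are written by hand to stand in for nltk.pos_tag output, so the snippet runs without NLTK installed. It also shows why the original membership test is always true.

```python
from typing import List, Tuple

# (word, tag) pairs, the shape nltk.pos_tag returns for one sentence
TaggedSentence = List[Tuple[str, str]]


def has_verb(tagged: TaggedSentence) -> bool:
    """Return True if any tag starts with 'VB' (VB, VBD, VBG, VBN, VBP, VBZ)."""
    return any(tag.startswith("VB") for _word, tag in tagged)


sents = ["this sentence has verb.", "this one not"]

# Hand-written tags for illustration only (an assumption, not real tagger output).
tagged_sents = [
    [("this", "DT"), ("sentence", "NN"), ("has", "VBZ"), ("verb", "NN"), (".", ".")],
    [("this", "DT"), ("one", "CD"), ("not", "RB")],
]

# The original condition asks whether the string 'VB' is one of the sentences,
# which it never is, so the test is True regardless of the tags.
always_true = "VB" not in sents

# Indices of the sentences in which no verb tag was found.
missing = [i for i, tagged in enumerate(tagged_sents) if not has_verb(tagged)]
for i in missing:
    print(f'error: "{sents[i]}" does not have a verb')
```

With real text, each entry of tagged_sents would instead come from nltk.pos_tag(nltk.word_tokenize(sent)); the check itself stays the same.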
