Keyword search in a string stops at the first occurrence found

Posted on 2025-02-08 20:31:57

I'm currently trying to search a string/text for keywords plus the word that follows each one. I want to store these results in an already created text file (f). I have written this code so far:

import re

def keyword_extraction(text, keyword_list, f):

    temp = re.findall(r"[\w']+", text)

    for keyword in keyword_list:
        if keyword in temp:
            results = [temp[temp.index(keyword) + 1]]
            for word in results:
                f.writelines(keyword + ': ' + word + '\n')
        else:
            f.writelines('Keyword "' + keyword + '" not found\n')

The problem is that the search stops at the first occurrence of each keyword. I want to extract every occurrence, so a keyword that appears twice in the text should be written down twice. Do you have any suggestions for how I can fix that?

Example input:

text = "today is a sunny day dont you think? I like this day very much"
keyword_list = ['like', 'day']

Expected output:

like: this
day: dont
day: very

Actual output:

like: this
day: dont

Thank you for your help!

还在原地等你 2025-02-15 20:31:57

text = "today is a sunny day dont you think? I like this day very much"
keyword_list = ['like', 'day']

words = text.split()
for index, word in enumerate(words):
    # Skip a keyword that is the last word, since nothing follows it
    if word in keyword_list and index + 1 < len(words):
        print(f'{word}: {words[index + 1]}')

Output:

day: dont
like: this
day: very
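
Two things worth noting here. The question's version stops early because `list.index` only ever returns the position of the first match, and `text.split()` leaves punctuation attached to words (`think?`). Below is a minimal sketch that keeps the question's regex tokenization and pairs each word with its successor using `zip`. The function name `following_words` is my own, and `io.StringIO` only stands in for the already opened file `f`:

```python
import io
import re

def following_words(text, keyword_list):
    """Return (keyword, next_word) pairs in text order, one per occurrence."""
    tokens = re.findall(r"[\w']+", text)
    keywords = set(keyword_list)
    # zip(tokens, tokens[1:]) pairs every word with its successor, so a
    # keyword that is the very last word is skipped (nothing follows it)
    return [(word, nxt) for word, nxt in zip(tokens, tokens[1:]) if word in keywords]

text = "today is a sunny day dont you think? I like this day very much"
pairs = following_words(text, ['like', 'day'])

f = io.StringIO()  # stand-in for the already opened text file
for keyword, nxt in pairs:
    f.write(f'{keyword}: {nxt}\n')
print(f.getvalue(), end='')
```

As with the loop above, the results come out in text order (day, like, day) rather than grouped by keyword.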

忘年祭陌 2025-02-15 20:31:57

You can do it with a loop: find the first occurrence of the keyword in the word list, write it out together with the next word, remove that occurrence from the list, and repeat until the keyword no longer appears.

import re

def keyword_extraction(text, keyword_list, f):
    tokens = re.findall(r"[\w']+", text)

    for keyword in keyword_list:
        # Work on a copy so removals for one keyword don't affect the next
        temp = list(tokens)
        found = False
        while keyword in temp:
            found = True
            try:
                next_word = temp[temp.index(keyword) + 1]
            except IndexError:
                # The keyword is the last word, so nothing follows it
                next_word = ''
            f.write(keyword + ': ' + next_word + '\n')
            temp.remove(keyword)
        if not found:
            f.write('Keyword "' + keyword + '" not found\n')

text = "today is a sunny day dont you think? I like this day very much"
keyword_list = ['like', 'day', 'much']

with open('keyword_search_results.txt', 'w') as f:
    keyword_extraction(text, keyword_list, f)

print('fini')
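
If you would rather not modify the list at all, another option is to pass a start position to `list.index` so that each search resumes after the previous hit. This is only a sketch of that idea, not a drop-in replacement: `find_all_following` is a hypothetical helper name, the extra keyword `'rain'` is added just to show the not-found case, and the lines are collected in a list instead of being written to `f`:

```python
import re

def find_all_following(tokens, keyword):
    """Return the word after every occurrence of keyword ('' if none follows)."""
    results = []
    start = 0
    while True:
        try:
            # list.index(value, start) resumes the search at position start
            pos = tokens.index(keyword, start)
        except ValueError:
            break  # no further occurrences
        results.append(tokens[pos + 1] if pos + 1 < len(tokens) else '')
        start = pos + 1
    return results

text = "today is a sunny day dont you think? I like this day very much"
tokens = re.findall(r"[\w']+", text)

lines = []
for keyword in ['like', 'day', 'much', 'rain']:
    next_words = find_all_following(tokens, keyword)
    if not next_words:
        lines.append(f'Keyword "{keyword}" not found')
    for word in next_words:
        lines.append(f'{keyword}: {word}')

print('\n'.join(lines))
```

The output stays grouped by keyword, as in the expected output of the question, and the token list is left intact for every keyword.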
