How do I get the words surrounding a substring in a string when the substring occurs more than once?

Posted on 2025-02-12 01:05:38 · 613 characters · 2 views · 0 comments

I have a task where I need to fetch N words before and after every occurrence of a substring (which may span multiple words) in a string. I initially considered using str.split(" ") and working with the resulting list, but the substring I'm looking for can be several words long.

I've tried using str.partition, and it's very close to doing exactly what I want, but it only finds the first occurrence of the keyword.

Code:

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
part = text.partition("Hello")
part = list(map(str.strip, part))
print(part)

Output:

['', 'Hello', "World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"]

This gives me exactly what I need for the first occurrence: from the three parts I can take the preceding and following words. Unfortunately, it breaks down when the substring I'm looking for is repeated.

If the output were instead a list of partitions, one for each occurrence, I could make it work. How should I approach this?

Comments (1)

美人骨 2025-02-19 01:05:38
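text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"

def recursive_partition(text, pattern):
  # Base case: nothing left to split. Return a list (not the empty string)
  # so the concatenation below cannot fail when the text ends with the pattern.
  if not text:
    return []
  head, sep, tail = text.partition(pattern)
  if sep:
    # Pattern found: keep the text before it and the match, then split the rest.
    return [head, sep] + recursive_partition(tail, pattern)
  # Pattern not found: the remaining text is the last piece.
  return [head]

res = recursive_partition(text, "Hello")
print(res)  # ['', 'Hello', ' World how are you doing ', 'Hello', " is the keyword I'm trying to get ", 'Hello', ' is a repeating word']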
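To get from the partitions to the N words on either side of each occurrence, one option is to skip the intermediate list and scan for matches directly. The following is a minimal sketch and not part of the original answer: the function name words_around and the parameter n are hypothetical, and it swaps the recursive str.partition approach for re.finditer with re.escape. Like str.partition, it matches plain substrings, so it would also match "Hello" inside a longer word.

```python
import re

def words_around(text, phrase, n):
    """Return a (before, after) pair of word lists for every occurrence of phrase.

    Hypothetical helper: `phrase` may span several words; `n` is the number
    of whitespace-separated words to keep on each side.
    """
    results = []
    # re.escape makes the phrase match literally, even if it contains regex characters.
    for match in re.finditer(re.escape(phrase), text):
        # Guard n == 0 explicitly, because words[-0:] would return the whole list.
        before = text[:match.start()].split()[-n:] if n > 0 else []
        after = text[match.end():].split()[:n]
        results.append((before, after))
    return results

text = "Hello World how are you doing Hello is the keyword I'm trying to get Hello is a repeating word"
for before, after in words_around(text, "Hello", 2):
    print(before, after)
# [] ['World', 'how']
# ['you', 'doing'] ['is', 'the']
# ['to', 'get'] ['is', 'a']
```

Because the context is taken from the text on each side of the match, a multi-word phrase such as "are you" works the same way as a single word.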