Python: Find the start index of a specific word number in a string

Posted on 2025-01-28 17:41:44

I have this string:

myString = "Tomorrow will be very very rainy"

I would like to get the start index of word number 5 (the second "very").

Currently I split myString into words:

import re

words = re.findall(r'\w+|[^\s\w]+', myString)

But I am not sure how to get the start index of word number 5, which is words[4] because list indices are zero-based.

Using index() does not work, as it finds the first occurrence of "very" (index 17) rather than the second one (index 22):

start_index = myString.index(words[4])
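One way to sidestep the index() problem is to let the regex engine report positions itself: re.finditer yields match objects, and each match's start() is the offset of that token in the original string. The sketch below reuses the tokenizing pattern from the question; the helper name word_start and its 1-based word_number parameter are illustrative assumptions, not part of the original post.

```python
import re

# Same tokenizer as in the question: runs of word characters, or runs of punctuation.
TOKEN_PATTERN = re.compile(r"\w+|[^\s\w]+")


def word_start(text, word_number):
    """Return the start offset of the 1-based word_number-th token, or -1 if absent."""
    for count, match in enumerate(TOKEN_PATTERN.finditer(text), start=1):
        if count == word_number:
            return match.start()
    return -1  # fewer than word_number tokens in text


my_string = "Tomorrow will be very very rainy"
print(word_start(my_string, 5))  # 22, the second "very"
```

Because the position comes from the match itself, repeated words and runs of several spaces need no special handling.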

南笙 2025-02-04 17:41:44

Not very elegant, but you can loop through the list of split words and calculate the index from the word lengths and the separator (in this case whitespace). This example targets the fifth word in the sentence and assumes the string does not start with whitespace.

myString = "Tomorrow will be very very rainy"

target_word = 5  # 1-based word number

split_string = myString.split()

idx_start = 0

for i in range(target_word - 1):
    # Skip past the current word...
    idx_start += len(split_string[i])
    # ...and past every whitespace character that follows it.
    while myString[idx_start].isspace():
        idx_start += 1

# Exclusive end index of the target word (no trailing space included)
idx_end = idx_start + len(split_string[target_word - 1])

print(idx_start, idx_end, myString[idx_start:idx_end])  # 22 26 very
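The same walk over the split words can also lean on str.find with a start offset, so that each word is searched for only after the end of the previous one. This is a sketch of that variant rather than the answerer's code; the function name nth_word_span is hypothetical, and it assumes words are separated by whitespace as str.split() defines it.

```python
def nth_word_span(text, target_word):
    """Return (start, end) of the 1-based target_word-th whitespace-separated word."""
    words = text.split()
    if not 1 <= target_word <= len(words):
        raise IndexError("word number out of range")
    pos = 0
    for word in words[:target_word]:
        # Search only from the end of the previous word, so earlier
        # occurrences of the same word are never picked up.
        start = text.find(word, pos)
        pos = start + len(word)
    return start, pos


my_string = "  Tomorrow   will be very very rainy"
start, end = nth_word_span(my_string, 5)
print(start, end, my_string[start:end])  # 26 30 very
```

Leading whitespace and runs of several spaces are absorbed by the search, at the cost of one find call per preceding word.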

深海蓝天 2025-02-04 17:41:44

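import re

myString = "Tomorrow will be very very rainy"
wordnum = 5  # 1-based word number; this approach needs wordnum >= 2
# End offset of every run of spaces, i.e. the start offset of the word that follows it
ends = [x.end() for x in re.finditer(" +", myString)]
pos = ends[wordnum-2]
print(pos)

Output:

22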

只是我以为 2025-02-04 17:41:44

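If there is exactly one space between words (and no leading whitespace):

Sum the lengths of all words before the wanted word.
Add the number of spaces before it, which equals the word's zero-based index.

word_idx = 4  # zero-based index
words = myString.split()
start_index = sum(len(word) for word in words[:word_idx]) + word_idx
print(start_index)

Result:

22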
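To make the arithmetic concrete, the same single-space assumption gives the start index of every word as a running total of "word length plus one". The snippet below is only an illustration of that formula; the names my_string and starts are not from the answer.

```python
from itertools import accumulate

my_string = "Tomorrow will be very very rainy"
words = my_string.split()

# Each word before the last pushes the next start forward by its length plus one space.
starts = [0] + list(accumulate(len(word) + 1 for word in words[:-1]))

for start, word in zip(starts, words):
    print(start, word)

print(starts[4])  # 22: zero-based index 4 is word number 5
```

With two or more spaces between any pair of words the running total falls behind the real offsets, which is why the single-space condition matters.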

岁月苍老的讽刺 2025-02-04 17:41:44

If the string starts with 5 words, you can match the first 4 words and capture the fifth one.

Then you can call the start method of the Match object and pass 1 to it to get the start index of the first capture group.

^(?:\w+\s+){4}(\w+)

Explanation

^ Start of string
(?:\w+\s+){4} Repeat 4 times: match 1+ word characters followed by 1+ whitespace characters
(\w+) Capture group 1: match 1+ word characters

Example

import re

myString = "Tomorrow will be very very rainy"
pattern = r"^(?:\w+\s+){4}(\w+)"
m = re.match(pattern, myString)
if m:
    print(m.start(1))

Output

22

For a broader match you can use \S+ to match one or more non-whitespace characters.

pattern = r"^(?:\S+\s+){4}(\S+)"
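The repeat count in the pattern is tied to word number 5. A minimal sketch of how it could be generalized is to build the pattern from the wanted word number; the function name word_start_regex is hypothetical, and like the pattern above it assumes the string begins with a word rather than with whitespace.

```python
import re


def word_start_regex(text, word_number):
    """Start offset of the 1-based word_number-th word, or None if there is no such word."""
    if word_number < 1:
        raise ValueError("word_number must be >= 1")
    # Skip (word_number - 1) words together with their trailing whitespace,
    # then capture the next run of non-whitespace characters.
    pattern = r"(?:\S+\s+){%d}(\S+)" % (word_number - 1)
    m = re.match(pattern, text)  # re.match already anchors at the start of text
    return m.start(1) if m else None


my_string = "Tomorrow will be very very rainy"
print(word_start_regex(my_string, 5))  # 22
print(word_start_regex(my_string, 7))  # None, the string has only six words
```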
