Python: find the start index of a specific word number in a string
I have this string:
myString = "Tomorrow will be very very rainy"
I would like to get the start index of word number 5 (the second "very").
Currently I split myString into words:
words = re.findall(r'\w+|[^\s\w]+', myString)
But I am not sure how to get the start index of word number 5, which is words[4] because list indices start at 0.
Using index() does not work, because it finds the first occurrence:
start_index = myString.index(words[4])
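To make the problem concrete, here is a minimal sketch that reproduces it. It assumes word number 5 means the fifth token, i.e. words[4] with zero-based indexing; the variable name first_occurrence is only illustrative.

```python
import re

myString = "Tomorrow will be very very rainy"

# Split into word tokens and punctuation tokens, as in the question.
words = re.findall(r'\w+|[^\s\w]+', myString)
print(words)  # ['Tomorrow', 'will', 'be', 'very', 'very', 'rainy']

# The fifth word is words[4] (zero-based indexing).
print(words[4])  # very

# str.index() searches for the text "very", so it reports the first
# "very" (index 17) rather than the second one (index 22).
first_occurrence = myString.index(words[4])
print(first_occurrence)  # 17
```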
Comments (4)
Not very elegant, but you can loop through the list of split words and calculate the index from the word lengths and the split character (in this case a single space). This answer targets the fifth word in the sentence.
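The original code for this answer is not preserved here, so the following is only a sketch of the loop described above. It assumes words are separated by exactly one space; the function name word_start_by_loop and its parameters are hypothetical.

```python
def word_start_by_loop(text, n, sep=" "):
    """Return the start index of the n-th word (1-based).

    Assumes words are separated by exactly one `sep`.
    """
    index = 0
    for position, word in enumerate(text.split(sep), start=1):
        if position == n:
            return index
        # Skip past this word and the separator that follows it.
        index += len(word) + len(sep)
    raise ValueError(f"text has fewer than {n} words")


myString = "Tomorrow will be very very rainy"
print(word_start_by_loop(myString, 5))  # 22
```

Because the running index is built from word lengths rather than a text search, repeated words such as the two "very" tokens do not confuse it.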
If there are only single spaces between words:
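The code for this answer is missing from the page, so this is an assumed reconstruction of the idea rather than the answerer's exact code. With exactly one space between words, the start of word n is the combined length of the n-1 words before it, plus one space for each. The helper name nth_word_start is hypothetical.

```python
def nth_word_start(text, n):
    """Start index of the n-th word (1-based), assuming single spaces."""
    words = text.split(" ")
    if not 1 <= n <= len(words):
        raise ValueError(f"text does not have a word number {n}")
    # Each preceding word contributes its length plus one separating space.
    return sum(len(word) + 1 for word in words[:n - 1])


myString = "Tomorrow will be very very rainy"
print(nth_word_start(myString, 5))  # 22
```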
If the string starts with 5 words, you can match the first 4 words and capture the fifth one.
Then you can use the start method of the Match object and pass 1 to it for the first capture group.
Explanation
^ Start of string
(?:\w+\s+){4} Repeat 4 times: match 1+ word characters followed by 1+ whitespace characters
(\w+) Capture group 1: match 1+ word characters
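The example code did not survive on this page, so here is a minimal sketch of the approach using the pattern explained above. The variable names pattern and match are illustrative.

```python
import re

myString = "Tomorrow will be very very rainy"

# Skip four "word + whitespace" pairs, then capture the fifth word.
pattern = r"^(?:\w+\s+){4}(\w+)"

match = re.match(pattern, myString)
if match:
    # start(1) is the index where capture group 1 begins.
    print(match.group(1), match.start(1))  # very 22
```

Since \s+ accepts any run of whitespace, this also works when words are separated by more than one space.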
For a broader match, you can use \S+ to match one or more non-whitespace characters.
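A short sketch of the difference, using a hypothetical input string with punctuation attached to a word (this string is not from the question):

```python
import re

# Hypothetical example: the first token carries a trailing comma.
text = "Tomorrow, it will be very very rainy"

# \w+ cannot consume the comma, so the strict pattern does not match.
strict = re.match(r"^(?:\w+\s+){4}(\w+)", text)
print(strict)  # None

# \S+ treats every run of non-whitespace characters as one token.
broad = re.match(r"^(?:\S+\s+){4}(\S+)", text)
if broad:
    print(broad.group(1), broad.start(1))  # very 21
```

Note that with \S+ any attached punctuation becomes part of the token, so "Tomorrow," counts as a single word here.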