Read all words that start with a substring

Posted on 2025-02-11 14:27:12

I have a CSV file that lists keywords. I need to find every word in the text that starts with one of those keywords.

text = "1 who were randomized 1 1 to daro 600 mg twice daily or matching pbo in addition to adt docetaxel randomization was stratified by extent of disease according to tnm m1a vs m1b vs m1c and alkaline phosphatase levels vs ≥ upper limit of normal the primary endpoint was os secondary efficacy endpoints included time to crpc time to pain progression time to first symptomatic skeletal event sse and time to initiation of subsequent systemic antineoplastic therapies safety was also assessed resu from nov 2016 to june 2018 1306 pts were randomized 651 to daro"

keyword = ["random*"]

So here I want to find all the words that start with random*.



零時差 2025-02-18 14:27:12

Use re.findall with the regex pattern \brandom\w* (a word boundary, the keyword, then any trailing word characters):

import re

text = "1 who were randomized 1 1 to daro 600 mg twice daily or matching pbo in addition to adt docetaxel randomization was stratified by extent of disease according to tnm m1a vs m1b vs m1c and alkaline phosphatase levels vs ≥ upper limit of normal the primary endpoint was os secondary efficacy endpoints included time to crpc time to pain progression time to first symptomatic skeletal event sse and time to initiation of subsequent systemic antineoplastic therapies safety was also assessed resu from nov 2016 to june 2018 1306 pts were randomized 651 to daro"

keywords = ["random"]
# re.escape keeps any regex metacharacters in a keyword literal
regex = r'\b(?:' + '|'.join(map(re.escape, keywords)) + r')\w*'
matches = re.findall(regex, text)
print(matches)  # ['randomized', 'randomization', 'randomized']
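The question's keywords come from a CSV file and carry a trailing * (as in "random*"), while the pattern above expects the bare prefix. Here is a minimal sketch of bridging that gap, assuming the CSV holds one keyword per row in its first column and that * only ever appears as a trailing wildcard. The helper names load_keywords and build_prefix_regex, the second keyword "strat*", and the shortened sample text are hypothetical, not from the original post.

```python
import csv
import io
import re

def load_keywords(csv_file):
    """Read one keyword per row from the first column (assumed CSV layout)."""
    return [row[0].strip() for row in csv.reader(csv_file) if row and row[0].strip()]

def build_prefix_regex(keywords):
    """Compile one pattern matching any word that starts with a keyword."""
    # Drop the trailing "*" wildcard, then escape regex metacharacters.
    prefixes = [re.escape(k.rstrip("*")) for k in keywords]
    prefixes = [p for p in prefixes if p]
    if not prefixes:
        # An empty alternation would match every word, so refuse it.
        raise ValueError("no usable keywords")
    return re.compile(r"\b(?:" + "|".join(prefixes) + r")\w*", re.IGNORECASE)

# Stand-in for the real CSV file; its contents are hypothetical.
csv_file = io.StringIO("random*\nstrat*\n")
keywords = load_keywords(csv_file)

text = "pts were randomized 1 1 and randomization was stratified by extent of disease"
matches = build_prefix_regex(keywords).findall(text)
print(matches)
```

Compiling the pattern once and reusing it avoids rebuilding the alternation for every text. re.IGNORECASE is an optional choice here; drop it if matching should be case-sensitive.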

