Python - regex - split a string before a word
I am trying to split a string in Python before a specific word. For example, I would like to split the following string before "path:".
- split string before "path:"
- input: "path:bte00250 Alanine, aspartate and glutamate metabolism path:bte00330 Arginine and proline metabolism"
- output: ['path:bte00250 Alanine, aspartate and glutamate metabolism', 'path:bte00330 Arginine and proline metabolism']
I have tried
rx = re.compile("(:?[^:]+)")
rx.findall(line)
This does not split the string anywhere. The trouble is that the values after "path:" are never known in advance, so I cannot specify the whole word. Does anyone know how to do this?
Comments (4)
Using a regular expression to split your string seems a bit overkill: the string split() method may be just what you need.
Anyway, if you really need to match a regular expression in order to split your string, you should use the re.split() method, which splits a string upon a regular expression match.
Also, use a correct regular expression for splitting: a space followed by the lookahead (?=path:). The (?=...) group is a lookahead assertion: the expression matches a space (note the space at the start of the expression) which is followed by the string 'path:', without consuming what follows the space.
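To make the lookahead idea concrete, here is a minimal sketch using the sample string from the question. The variable names `line` and `parts` are only illustrative, and the pattern assumes exactly one space precedes each later "path:".

```python
import re

line = ("path:bte00250 Alanine, aspartate and glutamate metabolism "
        "path:bte00330 Arginine and proline metabolism")

# Split on every space that is immediately followed by "path:".
# The lookahead (?=path:) only checks for "path:" and does not consume it,
# so each piece keeps its "path:" prefix and the space itself is dropped.
parts = re.split(r" (?=path:)", line)

print(parts)
```

If the separator may be several spaces or tabs, replacing the leading space in the pattern with `\s+` should behave the same way on this input.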

You could do
["path:"+s for s in line.split("path:")[1:]]
instead of using a regex. (Note that we skip the first match, which has no "path:" prefix.)
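As a rough illustration of the one-liner on the question's sample string: the names `pieces`, `parts` and `parts_stripped` are made up for this sketch. One thing worth noticing is that the space in front of the second "path:" stays attached to the end of the first piece, so a strip() is probably wanted.

```python
line = ("path:bte00250 Alanine, aspartate and glutamate metabolism "
        "path:bte00330 Arginine and proline metabolism")

# str.split removes the separator. The first element is whatever came
# before the first "path:" (an empty string here), so it is skipped with [1:].
pieces = line.split("path:")

# Re-attach the prefix that split() removed.
parts = ["path:" + s for s in pieces[1:]]

# Same thing, but with the leftover whitespace around each piece removed.
parts_stripped = ["path:" + s.strip() for s in pieces[1:]]

print(parts)
print(parts_stripped)
```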

This can be done without regular expressions. Given a string, we can temporarily replace the desired word with a placeholder. The placeholder is a single character, which we use to split by. Now that the string is split, we can rejoin the original word to each sub-string using a list comprehension.
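The original snippets for these steps are not shown, so the following is only a sketch of how the placeholder approach might look. The choice of "\x00" as placeholder and the names `word`, `marked` and `pieces` are assumptions; any single character that cannot occur in the data would do.

```python
line = ("path:bte00250 Alanine, aspartate and glutamate metabolism "
        "path:bte00330 Arginine and proline metabolism")

word = "path:"
placeholder = "\x00"  # assumption: this character never appears in the input

if placeholder in line:
    raise ValueError("placeholder already occurs in the input")

# Step 1: temporarily replace the word with the single-character placeholder.
marked = line.replace(word, placeholder)

# Step 2: split on the placeholder. The first element is the text before
# the first occurrence of the word (empty here), so it is skipped below.
pieces = marked.split(placeholder)

# Step 3: rejoin the original word to each sub-string, trimming whitespace.
parts = [word + p.strip() for p in pieces[1:]]

print(parts)
```

For a fixed word this ends up equivalent to splitting on the word directly; the placeholder step mainly helps when several different words should all act as split points.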