Tokenize a string keeping delimiters in Python
Is there any equivalent to str.split in Python that also returns the delimiters?
I need to preserve the whitespace layout for my output after processing some of the tokens.
Example:
>>> s = "\tthis is an example"
>>> print(s.split())
['this', 'is', 'an', 'example']
>>> print(what_I_want(s))
['\t', 'this', ' ', 'is', ' ', 'an', ' ', 'example']
Thanks!
Comments (5)
The re module provides this functionality, as described in the Python documentation for re.split.
For your example (split on whitespace), use
re.split(r'(\s+)', '\tThis is an example')
The key is to enclose the regex on which to split in capturing parentheses. That way, the delimiters are added to the list of results.
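As a rough illustration of the capturing-parentheses approach, here is a minimal sketch run against the string from the question. The helper name split_keep_delimiters is hypothetical (it is not part of the re module), and dropping the empty strings that re.split produces at the edges is a choice made for this example only.

```python
import re

def split_keep_delimiters(text):
    # Hypothetical helper for illustration; not part of the standard library.
    # The capturing group makes re.split return the whitespace runs as well.
    parts = re.split(r'(\s+)', text)
    # re.split yields '' where the string starts or ends with a delimiter;
    # filtering those out is an assumption about the desired output.
    return [part for part in parts if part]

tokens = split_keep_delimiters("\tthis is an example")
print(tokens)
# Joining the tokens gives back the original string, so the layout is preserved.
print(''.join(tokens) == "\tthis is an example")
```

Because every character of the input ends up in exactly one token, the tokens can be processed individually and then joined back without losing the whitespace layout.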
Edit: As pointed out, any leading or trailing delimiters will of course also be added to the list. To avoid that, you can call the .strip() method on your input string first.
Have you looked at pyparsing? Example borrowed from the pyparsing wiki:
Thanks, guys, for pointing out the re module. I'm still trying to decide between that and using my own function that returns a sequence... If I had time, I'd benchmark them xD
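For comparison, a hand-written function that returns a sequence might look something like the sketch below. This is not the poster's actual function, which was never shown; the name tokenize_keep_whitespace and the use of itertools.groupby are assumptions made purely for illustration.

```python
from itertools import groupby

def tokenize_keep_whitespace(text):
    """Yield alternating runs of whitespace and non-whitespace characters."""
    # Hypothetical helper: groupby collects consecutive characters that
    # share the same str.isspace() result into a single run.
    for _is_space, run in groupby(text, key=str.isspace):
        yield ''.join(run)

print(list(tokenize_keep_whitespace("\tthis is an example")))
```

Since it is a generator, the tokens are produced lazily; wrapping the call in list() gives a concrete sequence if one is needed.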
How about