How to split this string by regex?
I have some strings that look like this:
div#title.title.top
#main.main
a.bold#empty.red
They are similar to Haml, and I want to split them with a regex, but I don't know how to define it.
val r = """???""".r // HELP
val items = "a.bold#empty.red".split(r)
items // -> "a", ".bold", "#empty", ".red"
How can I do this?
UPDATE
Sorry, everyone, but I need to make this question harder. I'm very interested in
val r = """(?<=\w)\b"""
But it fails to parse more complex strings, such as:
div#question-title.title-1.h-222_333
I'd like it to be parsed into:
div
#question-title
.title-1
.h-222_333
How can I improve that regex?
Comments (3)
Note that split takes a String representing a regular expression, not a Regex, so you must not convert r from String to Regex.
Brief explanation of the regex:
(?<=...) is a look-behind. It states that this match must be preceded by the pattern ..., or, in your case, \w, meaning the match must follow a digit, letter, or underscore.
\b means word boundary. It is a zero-length match that happens between a word character (digit, letter, or underscore) and a non-word character, or vice versa. Because it is zero-length, split won't remove any characters when splitting.
(?!...) is a negative look-ahead. Here I use it to say that I'm not interested in word boundaries from a letter to a dash.
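The pattern that the look-behind, word-boundary, and look-ahead explanation walks through is not shown in the comment itself. A minimal sketch, assuming it simply combines the three pieces described (a look-behind for \w, a word boundary, and a negative look-ahead for a dash), might look like this. The object and method names are hypothetical.

```scala
object BoundarySplit {
  // Assumed pattern, assembled from the pieces the comment explains:
  //   (?<=\w)  the split point must follow a word character
  //   \b       zero-length word boundary, so no characters are consumed
  //   (?!-)    skip boundaries that are immediately followed by a dash
  // Kept as a String because String.split expects a regex String, not a Regex.
  val boundary: String = """(?<=\w)\b(?!-)"""

  // Splits a haml-like selector at every qualifying boundary.
  def parts(selector: String): List[String] =
    selector.split(boundary).toList
}
```

With this pattern, BoundarySplit.parts("div#question-title.title-1.h-222_333") yields div, #question-title, .title-1 and .h-222_333, because the boundaries in front of each dash are rejected by the look-ahead.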
Josh M's answer has a good regular expression, but split takes a regular expression that matches the "delimiter", whereas his pattern matches the terms themselves. You therefore need to use findAllIn rather than split to get the expected results.
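The findAllIn approach can be sketched as follows. This is an illustration rather than the commenter's exact code: the pattern is assumed from Josh M's description (an optional dot or hash followed by word characters), and the object and method names are hypothetical.

```scala
import scala.util.matching.Regex

object FindTerms {
  // Assumed pattern: an optional '.' or '#' followed by one or more word characters.
  val term: Regex = """[.#]?\w+""".r

  // findAllIn iterates over every match, so the matched terms themselves
  // are collected instead of being treated as delimiters and thrown away.
  def terms(selector: String): List[String] =
    term.findAllIn(selector).toList
}
```

For example, FindTerms.terms("a.bold#empty.red") gives List("a", ".bold", "#empty", ".red").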
I'm not completely sure what you need here, but this should help: define a "term" as an optional dot or hash followed by some word characters, and extract every term that matches.
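A pattern of this shape handles the simple inputs, but \w does not include the dash, so the harder input from the question's update would be broken apart at every dash. One possible adjustment, shown here as an assumption rather than as part of the original comment, is to allow dashes inside a term. The names below are hypothetical.

```scala
object HyphenTerms {
  // Pattern as described in the comment: optional '.' or '#', then word characters.
  val basic = """[.#]?\w+""".r

  // Hypothetical extension: also accept '-' inside a term.
  val withDashes = """[.#]?[\w-]+""".r

  // Collects every term matched by the dash-aware pattern.
  def terms(selector: String): List[String] =
    withDashes.findAllIn(selector).toList
}
```

HyphenTerms.terms("div#question-title.title-1.h-222_333") gives div, #question-title, .title-1 and .h-222_333.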