Regex - Saving Repeated Captured Groups
This is what I'm doing
a = "%span.rockets#diamonds.ribbons.forever"
a = a.match(/(^\%\w+)([\.|\#]\w+)+/)
puts a.inspect
This is what I get
#<MatchData "%span.rockets#diamonds.ribbons.forever" 1:"%span" 2:".forever">
This is what I want
#<MatchData "%span.rockets#diamonds.ribbons.forever" 1:"%span" 2:".rockets" 3:"#diamonds" 4:".ribbons" 5:".forever">
help? I tried and failed :(
Comments (2)
Generally, you can't get an arbitrary number of capturing groups, but if you use scan you get a separate match for every token you want to capture. The pattern for this isn't too different from your regex; the only change is that the repetition on the last token is removed.

\G isn't too well known: it tells the engine to match where the previous match ended, so the result isn't thrown off when there are extra characters between matches (as in "%span :P .rockets").

Generally, if your original regex could match several times in the input, this method may add some work, because the tokens are no longer grouped per match. Since match returns only a single result anyway, it should work fine here.

Working example: http://ideone.com/nnmki
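A minimal sketch of the scan approach described above, assuming the pattern alternates between the leading %tag and a \G-anchored token. The variable name tokens and the character class [.#] are choices made for this sketch, not taken from the answer; the | of the original class is dropped because inside brackets it matches a literal pipe.

```ruby
a = "%span.rockets#diamonds.ribbons.forever"

# ^%\w+      matches the leading "%span" token at the start of the string
# \G[.#]\w+  matches a ".name" or "#name" token that begins exactly
#            where the previous match ended
tokens = a.scan(/^%\w+|\G[.#]\w+/)

puts tokens.inspect
# ["%span", ".rockets", "#diamonds", ".ribbons", ".forever"]
```

Because the pattern has no capturing groups, scan returns a flat array of matched strings, one per token.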
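That's just how capturing groups work: a group that is repeated keeps only the text from its last iteration. If you want to save all of those substrings, put the quantifier inside the capturing group, so that the group wraps the whole repetition rather than a single token.
Then your second capture will be the entire run, .rockets#diamonds.ribbons.forever
...and you can break it down the rest of the way yourself.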
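One way this could look, as a sketch rather than the answer's exact code: the repeated token is wrapped in a non-capturing group (?:...) and the quantifier is moved inside the second capturing group. The names m and parts, the [.#] character class, and the follow-up scan used to split the second capture are assumptions of this sketch.

```ruby
a = "%span.rockets#diamonds.ribbons.forever"

# Group 1: the leading "%span" token.
# Group 2: the whole run of ".name" / "#name" tokens; the + applies to the
#          inner non-capturing group, so group 2 spans every repetition.
m = a.match(/(^%\w+)((?:[.#]\w+)+)/)

puts m[1]  # %span
puts m[2]  # .rockets#diamonds.ribbons.forever

# Breaking the second capture down into individual tokens.
parts = m[2].scan(/[.#]\w+/)
puts parts.inspect
# [".rockets", "#diamonds", ".ribbons", ".forever"]
```

Compared with the scan-only approach, this keeps match as the entry point and validates the overall shape of the string before the tokens are split.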