Return only the first match of a grouped regex
I've got a very long regex which contains baby regexes that are piped and contained within parentheses. The very last field that I'm trying to extract is the SKU. However, the regex substitution value returns two matching values. I want to return the first match only.
Here's the regex:
((WEIGHT\s?\(KG\)[\n\s\D\d]*?(?'weightkg'\d+)\.?(?'weightdecimal'\d+))|(information\((?'sku'\d\d\D\D.+?\-?\d+?\D+?)\)))
Here's the source data:
WEIGHT(KG)
Set (with Stand) 5.3Set (without Stand) 3.1
ACCESSORY
HDMI YesUSB Yes (type C)
COMPLIANCE INFORMATION
Dismantling information(24BL650C-B)
Dismantling information(24BL650C-BA)
EU Energy label 2019(24BL650C-B)
The substitution token $sku
returns:
24BL650C-B24BL650C-BA
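The doubled value can be reproduced outside the regex tester. The following is a minimal sketch in Python, which is an assumption on my part since the question does not name a language; Python writes named groups as (?P<name>...) instead of (?'name'...), and the variable names are illustrative. It shows that the pattern matches twice on "information(", so each match contributes its own 'sku' value.

```python
import re

# The question's pattern, with (?'name'...) rewritten as Python's (?P<name>...).
PATTERN = re.compile(
    r"((WEIGHT\s?\(KG\)[\n\s\D\d]*?(?P<weightkg>\d+)\.?(?P<weightdecimal>\d+))"
    r"|(information\((?P<sku>\d\d\D\D.+?\-?\d+?\D+?)\)))"
)

SOURCE = """WEIGHT(KG)
Set (with Stand) 5.3Set (without Stand) 3.1
ACCESSORY
HDMI YesUSB Yes (type C)
COMPLIANCE INFORMATION
Dismantling information(24BL650C-B)
Dismantling information(24BL650C-BA)
EU Energy label 2019(24BL650C-B)
"""

# Each match carries its own 'sku'; the WEIGHT match leaves that group unset (None).
skus = [m.group("sku") for m in PATTERN.finditer(SOURCE) if m.group("sku") is not None]

print(skus)           # ['24BL650C-B', '24BL650C-BA']
print("".join(skus))  # 24BL650C-B24BL650C-BA
```

The third SKU line is not matched because "label 2019(" does not contain "information(", which is why exactly two values are concatenated.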
My first idea was to add a lazy quantifier at the end of the grouped regex so that it looks like this:
((?'sku'\d\d\D\D.+?\-?\d+?\D+?)\))?
However, this appears to indicate that I want to match either the group or null and therefore returns tons of null values.
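The flood of null values follows from the trailing ? making the whole group optional: the pattern then succeeds with an empty match at every position where the SKU part does not fit. A small sketch, again assuming Python and its (?P<name>...) group syntax, with one line of the source data as hypothetical input:

```python
import re

# The attempted fix from the question: the SKU group made optional with a trailing '?'.
OPTIONAL = re.compile(r"((?P<sku>\d\d\D\D.+?\-?\d+?\D+?)\))?")

LINE = "Dismantling information(24BL650C-B)"

# An optional group that does not participate in a match reports None.
values = [m.group("sku") for m in OPTIONAL.finditer(LINE)]

real = [v for v in values if v is not None]
nulls = values.count(None)

print(real)       # the single real SKU
print(nulls > 1)  # many empty matches surround it
```

The quantifier here controls whether the group is required at all; it does not limit how many times the overall pattern may match.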
Comments (1)
You can use
((WEIGHT\s*\(KG\)[\D\d]*?(?'weightkg'\d+)\.?(?'weightdecimal'\d+))|(COMPLIANCE INFORMATION[^()]*\((?'sku'[^()]*)\)))
See the regex demo.
Details:
- ( - Group 1 start:
- (WEIGHT\s*\(KG\)[\D\d]*?(?'weightkg'\d+)\.?(?'weightdecimal'\d+)) - Group 2: WEIGHT, zero or more whitespaces, (KG), then any zero or more chars as few as possible, then Group 'weightkg' capturing one or more digits, then an optional dot, then Group 'weightdecimal' capturing one or more digits
- | - or
- (COMPLIANCE INFORMATION[^()]*\((?'sku'[^()]*)\)) - Group 5: COMPLIANCE INFORMATION, then any zero or more chars other than ( and ), then (, Group 'sku' capturing any zero or more chars other than parentheses, and then a ) char
- ) - Group 1 end.
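As a rough check of the pattern described above, here is a sketch in Python (an assumption, since the thread does not name a language; named groups are written (?P<name>...) there, and the variable names are illustrative). Anchoring the SKU branch on COMPLIANCE INFORMATION, which occurs once in the source data, means only the first parenthesised value after it is captured.

```python
import re

# The answer's pattern, with (?'name'...) rewritten as Python's (?P<name>...).
ANSWER = re.compile(
    r"((WEIGHT\s*\(KG\)[\D\d]*?(?P<weightkg>\d+)\.?(?P<weightdecimal>\d+))"
    r"|(COMPLIANCE INFORMATION[^()]*\((?P<sku>[^()]*)\)))"
)

SOURCE = """WEIGHT(KG)
Set (with Stand) 5.3Set (without Stand) 3.1
ACCESSORY
HDMI YesUSB Yes (type C)
COMPLIANCE INFORMATION
Dismantling information(24BL650C-B)
Dismantling information(24BL650C-BA)
EU Energy label 2019(24BL650C-B)
"""

# Collect every named group that actually took part in a match.
found = {}
for m in ANSWER.finditer(SOURCE):
    for name, value in m.groupdict().items():
        if value is not None:
            found[name] = value

print(found)  # {'weightkg': '5', 'weightdecimal': '3', 'sku': '24BL650C-B'}
```

Because [^()]* cannot cross a parenthesis, the SKU branch stops at the first "(...)" after the heading and the later SKU lines are left unmatched.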