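RegEx to extract all matches from a string using RegExp.exec
I'm trying to parse the following kind of string:
[key:"val" key2:"val2"]
where there are arbitrary key:"val" pairs inside. I want to grab the key name and the value. For those curious, I'm trying to parse the database format of Taskwarrior.
Here is my test string:
[description:"aoeu" uuid:"123sth"]
It is meant to highlight that anything can be in a key or value aside from a space, that there are no spaces around the colons, and that values are always in double quotes.
In node, this is my output: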
[deuteronomy][gatlin][~]$ node
> var re = /^\[(?:(.+?):"(.+?)"\s*)+\]$/g
> re.exec('[description:"aoeu" uuid:"123sth"]');
[ '[description:"aoeu" uuid:"123sth"]',
'uuid',
'123sth',
index: 0,
input: '[description:"aoeu" uuid:"123sth"]' ]
But description:"aoeu" also matches this pattern. How can I get all matches back?
Comments (19)
Continue calling re.exec(s) in a loop to obtain all the matches:
Try it with this JSFiddle: https://jsfiddle.net/7yS2V/
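The original snippet is not preserved on this page, so the following is only a minimal sketch of the loop described above. The pattern pairRe and the variable names are assumptions made for illustration; they match one key:"val" pair at a time instead of the whole bracketed string.

```javascript
// Pattern for a single key:"val" pair. The g flag makes exec() resume
// from re.lastIndex on every call instead of restarting at index 0.
const pairRe = /([^\s:\[\]]+):"([^"]*)"/g;
const input = '[description:"aoeu" uuid:"123sth"]';

const pairs = [];
let m;
// exec() returns null once no further match exists, which ends the loop.
while ((m = pairRe.exec(input)) !== null) {
  pairs.push({ key: m[1], value: m[2] });
}

console.log(pairs);
```

Each iteration yields the full exec result, so m.index and the capture groups are all available inside the loop.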
str.match(pattern), if pattern has the global flag g, will return all the matches as an array. For example:
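The example that followed is missing here; a small sketch of the same idea, using an assumed single-pair pattern on the question's test string, might look like this:

```javascript
const text = '[description:"aoeu" uuid:"123sth"]';

// With the g flag, match() returns only the full matched substrings;
// capture groups are not included. It returns null when nothing matches,
// hence the fallback to an empty array.
const found = text.match(/[^\s:\[\]]+:"[^"]*"/g) || [];

console.log(found); // [ 'description:"aoeu"', 'uuid:"123sth"' ]
```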
To loop through all matches, you can use the replace function:
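As a rough illustration of the replace approach (the pattern and names are assumptions, not the answer's original code), the callback passed to replace is invoked once per match and receives the capture groups as arguments:

```javascript
const source = '[description:"aoeu" uuid:"123sth"]';
const result = {};

// The callback receives the full match followed by each capture group.
// replace() is used here only to iterate; its return value is discarded.
source.replace(/([^\s:\[\]]+):"([^"]*)"/g, (whole, key, value) => {
  result[key] = value;
  return whole; // leave the text unchanged
});

console.log(result); // { description: 'aoeu', uuid: '123sth' }
```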
This is a solution based on lawnsea's answer, but shorter.
Notice that the g flag must be set to move the internal pointer forward across invocations.
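To make the note about the internal pointer concrete, here is a short sketch (the pattern is an assumption) that records re.lastIndex after each call. Without the g flag, lastIndex would stay at 0 and the loop would never end.

```javascript
const re = /(\w+):"(\w+)"/g;
const s = '[description:"aoeu" uuid:"123sth"]';

const positions = [];
// The assignment doubles as the loop condition: exec() yields null when done.
for (let m; (m = re.exec(s)); ) {
  // lastIndex now points just past the match that was returned.
  positions.push([m[1], m[2], re.lastIndex]);
}

console.log(positions);
// After the final null result, exec() resets lastIndex to 0.
console.log(re.lastIndex);
```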
Calling match with a global regex returns all matches as an array.
If, for some mysterious reason, you need the additional information that comes with exec, then as an alternative to the previous answers you could use a recursive function instead of a loop, as follows (which also looks cooler :).
As stated in the comments before, it's important to have g at the end of the regex definition to move the pointer forward on each execution.
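The recursive version itself is not preserved on this page. One way it could be written is sketched below; the function name collectMatches and the pattern are hypothetical.

```javascript
// Collects every exec() result recursively. The regex must carry the g flag,
// otherwise lastIndex never advances and the recursion would not terminate.
function collectMatches(re, str, acc = []) {
  const m = re.exec(str);
  return m === null ? acc : collectMatches(re, str, [...acc, m]);
}

const all = collectMatches(/(\w+):"(\w+)"/g, '[description:"aoeu" uuid:"123sth"]');

// Each entry is a full exec() result, so index and the groups are available.
const summary = all.map((m) => ({ key: m[1], value: m[2], index: m.index }));
console.log(summary);
```

Recursion depth grows with the number of matches, so a loop is likely the safer choice for very long inputs.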
We are finally beginning to see a built-in matchAll function; see here for the description and compatibility table. As of May 2020, it looks like Chrome, Edge, Firefox, and Node.js (12+) are supported, but not IE, Safari, or Opera. It seems to have been drafted in December 2018, so give it some time to reach all browsers, but I trust it will get there.
The built-in matchAll function is nice because it returns an iterable. It also returns the capturing groups for every match, so you can loop over the matches and read each group directly.
It also seems that every match object uses the same format as match(). So each object is an array of the match and capturing groups, along with three additional properties: index, input, and groups.
For more information about matchAll there is also a Google developers page. There are also polyfills/shims available.
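The answer's snippets are missing from this page. A minimal sketch of iterating matchAll over the question's test string, with an assumed pattern, could look like this:

```javascript
const str = '[description:"aoeu" uuid:"123sth"]';
const rows = [];

// matchAll() requires the g flag and throws a TypeError without it.
for (const match of str.matchAll(/(\w+):"(\w+)"/g)) {
  // Each match is an array: the full match first, then the capture groups.
  const [whole, key, value] = match;
  rows.push({ whole, key, value, index: match.index });
}

console.log(rows);
```

Each match also carries input (the original string) and groups, which is undefined here because the pattern has no named groups.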
If you have ES2020
(meaning your runtime, such as Chrome, Node.js, Deno, Bun, or Firefox, supports ECMAScript 2020 or later)
MDN documentation (String.prototype.matchAll)
If you use NPM
You can use the official polyfill
npm install string.prototype.matchall
Otherwise
Here are some functionally similar copy-paste versions
Example usage:
Outputs:
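The copy-paste versions are not preserved here. As a stand-in, the sketch below shows one way to approximate matchAll in older environments; the name matchAllArray is invented for this example, and unlike the built-in it returns an array, adds the g flag itself, and keeps only the i and m flags.

```javascript
// Hypothetical stand-in for matchAll (not the answer's original code).
function matchAllArray(str, re) {
  // Rebuild the pattern with the g flag so exec() advances, and so the
  // caller's regex object (and its lastIndex) is left untouched.
  var flags = 'g' + (re.ignoreCase ? 'i' : '') + (re.multiline ? 'm' : '');
  var copy = new RegExp(re.source, flags);
  var results = [];
  var m;
  while ((m = copy.exec(str)) !== null) {
    if (m[0] === '') copy.lastIndex++; // step past zero-length matches
    results.push(m);
  }
  return results;
}

var found = matchAllArray('[description:"aoeu" uuid:"123sth"]', /(\w+):"(\w+)"/);
console.log(found.map(function (m) { return m[1] + '=' + m[2]; }));
// [ 'description=aoeu', 'uuid=123sth' ]
```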
Based on Agus's function, but I prefer to return just the match values:
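Neither Agus's function nor this variant is preserved on the page. The sketch below assumes "match values" means the captured groups of each match; the name matchValues is hypothetical.

```javascript
// Returns only the captured groups of every match, dropping the
// index/input metadata that exec() attaches to its result arrays.
function matchValues(re, str) {
  if (!re.global) throw new TypeError('the regex needs the g flag');
  re.lastIndex = 0; // start from the beginning regardless of earlier use
  const values = [];
  let m;
  while ((m = re.exec(str)) !== null) {
    if (m[0] === '') re.lastIndex++; // avoid looping forever on empty matches
    values.push(m.slice(1));
  }
  return values;
}

const values = matchValues(/(\w+):"(\w+)"/g, '[description:"aoeu" uuid:"123sth"]');
console.log(values); // [ [ 'description', 'aoeu' ], [ 'uuid', '123sth' ] ]
```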
Iterables are nicer:
Usage in a loop:
Or if you want an array:
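The three snippets are missing from this page. A possible reconstruction, with the generator name matches chosen for illustration, is a generator that wraps exec, followed by the two usages mentioned above:

```javascript
// Lazily yields each exec() result; the regex must have the g flag.
function* matches(re, str) {
  if (!re.global) throw new TypeError('the regex needs the g flag');
  re.lastIndex = 0;
  let m;
  while ((m = re.exec(str)) !== null) {
    if (m[0] === '') re.lastIndex++; // step past zero-length matches
    yield m;
  }
}

const re = /(\w+):"(\w+)"/g;
const str = '[description:"aoeu" uuid:"123sth"]';

// Usage in a loop:
const seen = [];
for (const m of matches(re, str)) {
  seen.push(m[1] + '=' + m[2]);
}

// Or as an array:
const asArray = Array.from(matches(re, str));
console.log(seen, asArray.length);
```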
Here is my function to get the matches:
If you're able to use matchAll, here's a trick: Array.from has a 'selector' parameter (its second argument, a mapping function), so instead of ending up with an array of awkward 'match' results you can project it to what you really need:
If you have named groups, e.g. (/(?<firstname>[a-z][A-Z]+)/g), you could do this:
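The answer's snippets are not preserved. The sketch below illustrates the same trick on the question's test string; the named groups key and value are assumptions chosen for this example.

```javascript
const input = '[description:"aoeu" uuid:"123sth"]';
const re = /(?<key>\w+):"(?<value>\w+)"/g;

// The second argument of Array.from maps every match as it is collected.
const keys = Array.from(input.matchAll(re), (m) => m.groups.key);

// Copy the named groups into plain objects to get key/value records.
const pairs = Array.from(input.matchAll(re), (m) => ({ ...m.groups }));

console.log(keys);  // [ 'description', 'uuid' ]
console.log(pairs);
```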
Since ES9, there's now a simpler, better way of getting all the matches, together with information about the capture groups and their index:
It is currently supported in Chrome, Firefox, and Opera. Depending on when you read this, check this link to see its current support.
Use this...
It will return an array of all matches. That would work just fine.
But remember, it won't take groups into account. It will just return the full matches.
I would definitely recommend using the String.match() function and creating a relevant RegEx for it. My example is with a list of strings, which is often necessary when scanning user inputs for keywords and phrases.
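The answer's example is missing from this page. A sketch of the idea follows; the keyword list and input text are made up for illustration, and each keyword is escaped so it is matched literally.

```javascript
// Hypothetical keyword list to scan for.
const keywords = ['refund', 'cancel order', 'c++'];

// Escape regex metacharacters so each keyword is treated as plain text.
const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// One alternation of all keywords, global and case-insensitive.
const keywordRe = new RegExp(keywords.map(escapeRe).join('|'), 'gi');

const userInput = 'Please CANCEL ORDER 42 and send a refund. I was learning C++.';
const hits = userInput.match(keywordRe) || [];

console.log(hits); // [ 'CANCEL ORDER', 'refund', 'C++' ]
```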
Hope this helps!
This isn't really going to help with your more complex issue, but I'm posting it anyway because it is a simple solution for people who aren't doing a global search like you are.
I've simplified the regex in the answer to be clearer (this is not a solution to your exact problem).
That looks more verbose than it is because of the comments; this is what it looks like without comments:
Note that any groups that do not match will be listed in the array as undefined values.
This solution uses the ES6 spread operator to purify the array of regex-specific values. You will need to run your code through Babel if you want IE11 support.
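The answer's code is not preserved here. The sketch below shows the general idea with an assumed non-global pattern whose last group is optional, so that it does not participate in the match.

```javascript
// Non-global match: returns one array with the full match and the groups,
// plus extra properties such as index and input.
const m = 'description:"aoeu"'.match(/^(\w+):"(\w+)"( !)?$/);

// Spreading copies only the array elements, leaving those properties behind.
const plain = m ? [...m] : [];

console.log(plain); // the optional third group did not match, so it is undefined
console.log('index' in m, 'index' in plain); // true false
```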
Here's a one-line solution without a while loop.
The order is preserved in the resulting list.
The potential downsides are
My guess is that if there are edge cases such as extra or missing spaces, this expression with fewer boundaries might also be an option:
Test
RegEx Circuit
jex.im visualizes regular expressions.
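The expression and test from this answer are missing. Purely as an illustration of a looser pattern (not the answer's actual regex), the sketch below tolerates optional whitespace around the colon and between pairs:

```javascript
// Assumed, more permissive pattern: whitespace around ":" is allowed and
// the surrounding brackets are not anchored.
const loose = /([^\s:"\[\]]+)\s*:\s*"([^"]*)"/g;

const messy = '[ description : "aoeu"   uuid:"123sth"]';
const out = Array.from(messy.matchAll(loose), (m) => [m[1], m[2]]);

console.log(out); // [ [ 'description', 'aoeu' ], [ 'uuid', '123sth' ] ]
```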
Basically, this is the ES6 way to convert the Iterator returned by exec to a regular Array.
Here is my answer: