Getting sub-match_results with boost::regex
Hey, let's say I have this regex: (test[0-9])+
And that I match it against: test1test2test3test0
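boost::regex r("(test[0-9])+");
boost::smatch what;
const std::string input = "test1test2test3test0";
const bool ret = boost::regex_search(input, what, r);
for (size_t i = 0; i < what.size(); ++i)
    cout << i << ':' << string(what[i]) << "\n";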
Now, what[1] will be test0 (the last occurrence). Let's say that I need to get test1, 2 and 3 as well: what should I do?
Note: the real regex is far more complex and has to remain one overall match, so changing the example regex to (test[0-9]) won't work.
Comments (3)
I think .NET can build a capture collection for a single group, so that (grp)+ produces a collection object on group 1. Boost's regex_search() works like any ordinary match function: you sit in a while() loop, matching the pattern from where the last match left off. The form you used does not take a bidirectional-iterator pair, so the function won't start the next match where the last one ended.
You can use the iterator form:
(Edit: you can also use the token iterator, defining which groups to iterate over. Added in the code below.)
Output:
test1
test2
test3
test0
Boost.Regex offers experimental support for exactly this feature (called repeated captures); however, since it incurs a huge performance hit, the feature is disabled by default.
To enable repeated captures, you need to rebuild Boost.Regex and define the macro BOOST_REGEX_MATCH_EXTRA in all translation units; the best way to do this is to uncomment this define in boost/regex/user.hpp (see the reference, it's at the very bottom of the page). Once compiled with this define, you can use the feature by passing the match_extra flag to regex_search, regex_match and regex_iterator. Check the Boost.Regex reference for more info.
Seems to me like you need to create a regex_iterator, using the (test[0-9]) regex as input. Then you can use the resulting regex_iterator to enumerate the matching substrings of your original target. If you still need "one overall match", then perhaps that work has to be decoupled from the task of finding the matching substrings. Can you clarify that part of your requirement?