Getting sub-match results with boost::regex

Posted on 2024-11-03 20:17:16


Hey, let's say I have this regex: (test[0-9])+

And that I match it against: test1test2test3test0

const bool ret = boost::regex_search(input, what, r);

for (size_t i = 0; i < what.size(); ++i)
    cout << i << ':' << string(what[i]) << "\n";

Now, what[1] will be test0 (the last occurrence). Let's say I need to get test1, test2 and test3 as well: what should I do?

Note: the real regex is far more complex and has to remain one overall match, so changing the example regex to (test[0-9]) won't work.


近箐 2024-11-10 20:17:16

I think .NET can expose a capture collection for a single group, so that (grp)+ creates a collection object on group 1. The Boost engine's regex_search() behaves like any ordinary match function: you sit in a while() loop, matching the pattern again from where the last match left off. The form you used does not take a pair of bidirectional iterators, so the function can't start the next match where the last match left off.

You can use the iterator form:
(Edit: you can also use the token iterator, specifying which groups to iterate over. Added to the code below.)

#include <boost/regex.hpp> 
#include <string> 
#include <iostream> 

using namespace std;
using namespace boost;

int main() 
{ 
    string input = "test1 ,, test2,, test3,, test0,,";
    boost::regex r("(test[0-9])(?:$|[ ,]+)");
    boost::smatch what;

    std::string::const_iterator start = input.begin();
    std::string::const_iterator end   = input.end();

    while (boost::regex_search(start, end, what, r))
    {
        string stest(what[1].first, what[1].second);
        cout << stest << endl;
        // Update the beginning of the range to the character
        // following the whole match
        start = what[0].second;
    }

    // Alternate method using token iterator 
    const int subs[] = {1};  // we just want to see group 1
    boost::sregex_token_iterator i(input.begin(), input.end(), r, subs);
    boost::sregex_token_iterator j;
    while(i != j)
    {
       cout << *i++ << endl;
    }

    return 0;
}

Output (each of the two methods prints these four lines):

test1
test2
test3
test0
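
Since the question says the regex has to stay one overall match, here is a minimal sketch of a two-step variant: match the whole pattern once, then iterate over the inner group only inside the range covered by what[0]. The helper name collect_repeats and the separate inner pattern are my own assumptions, not something from the question. The sketch is written against std::regex through a namespace alias, on the assumption that swapping the alias to boost (and including <boost/regex.hpp>) behaves the same for these calls. It also assumes that rescanning the matched range with the inner pattern finds the same pieces the repetition consumed, which may not hold for a much more complex regex.

```cpp
#include <regex>
#include <string>
#include <vector>

// Alias so the same code can target either library. Assumption: using
// `namespace rx = boost;` with <boost/regex.hpp> works the same way here.
namespace rx = std;

// Hypothetical helper: returns every repetition of the inner group found
// inside the single overall match, or an empty vector if nothing matches.
std::vector<std::string> collect_repeats(const std::string& input)
{
    static const rx::regex whole("(test[0-9])+");  // pattern from the question
    static const rx::regex inner("test[0-9]");     // the repeated group alone

    std::vector<std::string> parts;
    rx::smatch what;
    if (!rx::regex_search(input, what, whole))
        return parts;

    // Walk only the characters covered by the overall match, what[0].
    rx::sregex_iterator it(what[0].first, what[0].second, inner);
    rx::sregex_iterator end;
    for (; it != end; ++it)
        parts.push_back(it->str());
    return parts;
}
```

With the input from the question, test1test2test3test0, this should collect test1, test2, test3 and test0 in that order. Boost.Regex also documents a repeated-captures feature (the match_extra flag together with sub_match::captures()), but it is only available when the library is built with BOOST_REGEX_MATCH_EXTRA defined, so check your build before relying on it.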
