tr1::regex regex_search problem
I'm using tr1::regex to try to extract some matches from a string. An example string could be
asdf werq "one two three" asdf
and from that I would want to get:
asdf
werq
one two three
asdf
with the stuff in quotes grouped together, so I'm trying to use the regex \"(.+?)\"|([^\\s]+). The code I'm using is:
cmatch res;
regex reg("\"(.+?)\"|([^\\s]+)", regex_constants::icase);
regex_search("asdf werq \"one two three\" asdf", res, reg);
cout << res.size() << endl;
for (unsigned int i = 0; i < res.size(); ++i) {
    cout << res[i] << endl;
}
but that outputs
3
asdf
asdf
What am I doing wrong?
Comments (2)
It appears that your regex engine does not support lookbehind assertions. To avoid using lookbehinds, you can try the following:
or quoted:
This regex will work, but each match will have two captures, one of which will be empty (either the first -- in case of a non-quoted word, or the second -- in case of a quoted string).
To be able to use this you need to iterate over all matches, and for each match use the non-empty capture.
I don't know enough about TR1, so I don't know exactly how one iterates over all matches. But if I'm not mistaken, res.size() will always be equal to 3. For example, for the string
asdf "one two three" werq
the first match will be:
the second match will be:
and the third match will be:
HTH.
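The iteration that this answer leaves open can be sketched roughly as follows. This is only a sketch under stated assumptions, not the answerer's code: it uses C++11 std::regex (with a TR1 implementation the same names should live in std::tr1), and the pattern "([^"]*)"|(\S+) is an assumption picked to fit the description above (first capture for quoted text, second capture for a bare word), since the answer's own pattern is not shown here. The function name tokenize is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `input` into tokens. A double-quoted run
// becomes one token (without the quotes); everything else is split on
// whitespace.
std::vector<std::string> tokenize(const std::string& input) {
    // Assumed pattern, not taken from the answer:
    //   capture 1 = text between double quotes
    //   capture 2 = a run of non-whitespace characters
    static const std::regex pattern("\"([^\"]*)\"|(\\S+)");

    std::vector<std::string> tokens;
    // sregex_iterator repeats the search after each match until none is left.
    for (std::sregex_iterator it(input.begin(), input.end(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;  // m.size() is 3 for every match
        // Only one of the two captures takes part in any given match,
        // so pick whichever one actually matched.
        tokens.push_back(m[1].matched ? m[1].str() : m[2].str());
    }
    return tokens;
}
```

Called on the question's string asdf werq "one two three" asdf, this should return the four items asdf, werq, one two three and asdf. Checking m[1].matched instead of testing for an empty string keeps an empty quoted string ("") from being mistaken for an unquoted word.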
You may want to try the following regex instead:
When quoted, it of course needs to be escaped:
Btw, the code you used matches only the first word in the target string, because a single regex_search call stops at the first match. The match_any flag does not change that; getting every match means searching repeatedly, for example with regex_iterator. The 3 items you are getting in the result are (1) the entire match, (2) the first capture, which is empty, and (3) the second capture, which is the source of the match.
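To see where those 3 items come from, here is a small sketch that runs one regex_search with the question's pattern and collects every sub-match of that single result. It assumes C++11 std::regex rather than tr1::regex, and the helper name first_match_parts is hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: run ONE regex_search with the question's pattern
// and return each sub-match of that single result as a string.
std::vector<std::string> first_match_parts(const std::string& input) {
    const std::regex reg("\"(.+?)\"|([^\\s]+)", std::regex_constants::icase);
    std::smatch res;
    std::vector<std::string> parts;
    if (std::regex_search(input, res, reg)) {
        // res[0] is the whole match; res[1] and res[2] are the two captures.
        // A capture that did not take part in the match converts to "".
        for (std::size_t i = 0; i < res.size(); ++i) {
            parts.push_back(res[i].str());
        }
    }
    return parts;
}
```

For the question's string this should give three entries: asdf (the whole match), an empty string (the quoted-text capture, which did not participate) and asdf again (the bare-word capture). The rest of the input is never looked at, which is why werq and the quoted part do not show up.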