Setting a boost regex from an external source
I need to parse a log and I have a regex that works well, but now I need to set the regex from a config file, and that is where the problem is.
int logParser()
{
    std::string bd_regex; // this reads from config in other part of program
    boost::regex parsReg;
    //("(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])");
    try
    {
        parsReg.assign(bd_regex, boost::regex_constants::icase);
    }
    catch (boost::regex_error& e)
    {
        cout << bd_regex << " is not a valid regular expression: \""
             << e.what() << "\"" << endl;
    }
    cout << parsReg << endl;
    // here it looks exactly like:
    // "("(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])");"
    int count=0;
    ifstream in;
    in.open(bd_log_path.c_str());
    while (!in.eof())
    {
        in.getline(buf, BUFSIZE-1);
        std::string s = buf;
        boost::smatch m;
        if (boost::regex_search(s, m, parsReg)) // it doesn't obey this "if"
        {
            std::string name, diagnosis;
            name.assign(m[2]);
            diagnosis.assign(m[4]);
            strcpy(bd_scan_results[count].file_name, name.c_str());
            strcpy(bd_scan_results[count].out, diagnosis.c_str());
            strcat(bd_scan_results[count].out, " ");
            count++;
        }
    }
    return count;
}
I really don't know why the same regex doesn't work when I try to set it from a config variable.
Any help will be appreciated (:
Comments (2)
On your direct question: try storing the regex in the config file without the C++ escapes.
Besides, it looks like you wanted to match backslashes here:
In the config, write:
In a C++ string literal that would be
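As a minimal sketch of the difference between the two forms, assume the log paths start with C:\tmp\bd. That path is an assumption; the original pattern only hints at it. The sketch uses std::regex as a stand-in for boost::regex so that it has no external dependency, and the names config_text, literal_text and matchesPrefix are hypothetical. The raw string literal holds the characters exactly as a file reader would deliver them.

```cpp
#include <regex>
#include <string>

// Pattern text as it would sit in the config file (assumed path prefix):
//   C:\\tmp\\bd
// The regex engine reads each \\ as "match one literal backslash".
const std::string config_text = R"(C:\\tmp\\bd)";

// The same pattern written as an ordinary C++ string literal.
// The compiler halves the backslashes, so every one must be doubled again.
const std::string literal_text = "C:\\\\tmp\\\\bd";

// Hypothetical helper: case-insensitive search for `pattern` inside `line`.
bool matchesPrefix(const std::string& pattern, const std::string& line) {
    const std::regex re(pattern, std::regex_constants::icase);
    return std::regex_search(line, re);
}
```

Both strings hold the same characters at run time, so either one can be handed to the regex engine. Only the way they are written down differs.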
@sehe gives the correct answer.
If this line of code were parsed by the C++ compiler,
str = "(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])";
it would unescape the escape sequence \\ into a single backslash \ and then assign the result to the variable 'str'. Inside the variable 'str', the text now looks like this:
(C:.tmp.bd.*?)+(([a-zA-Z0-9_]+\.)+[a-zA-Z]{2,4})+(.+[a-zA-Z0-9_])
But you are reading this text from a file, so there is no parsing in the language sense. You are assigning to 'str' a raw line of text, one that has not been pre-processed by the C++ compiler. The regex engine therefore receives \\. (a literal backslash followed by any character) instead of \. (a literal dot).
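The effect can be reproduced with a short sketch. It uses std::regex as a stand-in for boost::regex (an assumption made only to avoid an external dependency), and the names fullMatch, compiled and from_file are hypothetical. Only the file-name part of the pattern is used, and the raw string literal simulates a line read verbatim from the config file.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `pattern` matches the whole of `text`.
bool fullMatch(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern, std::regex_constants::icase);
    return std::regex_match(text, re);
}

// Written in source code: the compiler turns \\. into \. before the
// regex engine sees it, so it means "a literal dot".
const std::string compiled = "([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4}";

// The same characters read verbatim from a file: the engine sees \\. ,
// which means "a literal backslash followed by any character".
const std::string from_file = R"(([a-zA-Z0-9_]+\\.)+[a-zA-Z]{2,4})";
```

With these definitions, fullMatch(compiled, "virus.exe") succeeds while fullMatch(from_file, "virus.exe") fails, because the second pattern requires a backslash that the file name does not contain. Both patterns compile without an exception, which is why the try/catch in the question reports nothing.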