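Boost regex tokenizer and newline characters
I'm currently trying to split a text file into a vector of strings wherever a newline is encountered. I've used boost tokenizer to do this with other delimiters before, but when I use the newline character '\n' it throws an exception at runtime: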
terminate called after throwing an instance of 'boost::escaped_list_error'
what(): unknown escape sequence
Aborted
Here is the code:
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>

std::vector<std::string> parse_lines(const std::string& input_str){
    using namespace boost;
    std::vector<std::string> parsed;
    tokenizer<escaped_list_separator<char> > tk(input_str, escaped_list_separator<char>('\n'));
    for (tokenizer<escaped_list_separator<char> >::iterator i(tk.begin());
         i != tk.end(); ++i)
    {
        parsed.push_back(*i);
    }
    return parsed;
}
Any suggestions are greatly appreciated!
escaped_list_separator's constructor expects the escape character, then the delimiter character, then the quote character. By using a newline as your escape character, it's treating the first character of every line in your input as part of an escape sequence. Try this instead: escaped_list_separator<char>('\\', '\n')
http://www.boost.org/doc/libs/1_46_1/libs/tokenizer/escaped_list_separator.htm
Given that the separator you want is already supported directly by the standard library, I think I'd skip using regexes for this at all and use what's already present in the standard library: treat the string as a stream and read lines from it.
Once you treat the string as a stream and read lines from there, you have quite a few options about the details of how you go from there. Just for a couple of examples, you might want to use the line proxy and/or LineInputIterator classes that @UncleBens and I posted in response to a previous question.
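A minimal sketch of the stream approach, using only std::istringstream and std::getline. The function name split_lines_getline is made up for this illustration and is not from the original answer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split input_str into lines by reading it as a stream.
std::vector<std::string> split_lines_getline(const std::string& input_str) {
    std::istringstream stream(input_str);
    std::vector<std::string> lines;
    std::string line;

    // std::getline reads up to the next '\n' and discards the delimiter.
    while (std::getline(stream, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Empty lines in the middle of the input are kept as empty strings, and a single trailing newline does not produce an extra empty entry.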
This might work better.
Edit: Alternatively, you can use std::find and write your own splitter loop.
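One way such a splitter loop could look, as a sketch rather than the answerer's own code. The name split_lines_find is hypothetical, and dropping a final empty line after a trailing newline is a design choice made here for the illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: split input_str on '\n' with a hand-written std::find loop.
std::vector<std::string> split_lines_find(const std::string& input_str) {
    std::vector<std::string> lines;
    std::string::const_iterator start = input_str.begin();
    const std::string::const_iterator end = input_str.end();

    while (start != end) {
        // Locate the next newline, or end if there is none left.
        std::string::const_iterator nl = std::find(start, end, '\n');
        lines.push_back(std::string(start, nl));
        if (nl == end) {
            break;  // last line had no trailing newline
        }
        start = nl + 1;  // skip past the newline itself
    }
    return lines;
}
```

Because it does no escape or quote handling, this loop cannot throw escaped_list_error, whatever characters the text contains.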