rapidxml parsing error with a URL attribute
I'm getting a strange error with rapidxml when parsing an XML file like this:
<?xml version="1.0" encoding="UTF-8"?>
<IMG align="left"
src="http://www.w3.org/Icons/WWW/w3c_home" />
It throws "expected >".
I'm using code like the following to parse the data:
std::fstream file("./test.xml");
std::istream_iterator<char> eos;
std::istream_iterator<char> iit (file);
std::vector<char> xml(iit, eos);
xml.push_back('\0');
xml_document<> doc;
doc.parse<0>(&xml[0]);
The "/" symbol in the IMG tag seems to be the problem. Is this a rapidxml bug, or am I doing something wrong?
Comments (3)
The way you load the XML data into the vector is wrong. C++ streams have the "skipws" flag set by default, which makes formatted extraction (which is what std::istream_iterator<char> uses) skip all whitespace in the input. You can verify this by examining the contents of your vector: all spaces and line endings will be missing. This is what causes the parser to complain.
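To see the effect without rapidxml at all, here is a minimal sketch that reads the sample document through std::istream_iterator<char> exactly as the question does. The function names are made up for this example, and a std::istringstream stands in for the file.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Reads a stream the same way the question does. std::istream_iterator<char>
// uses formatted extraction (operator>>), which skips whitespace while the
// skipws flag is set (the default).
std::string read_like_question(std::istream& in) {
    std::istream_iterator<char> it(in), eos;
    return std::string(it, eos);
}

// Hypothetical helper for this example: pushes the sample element from the
// question through the loader above so the result can be inspected.
std::string sample_as_parser_sees_it() {
    std::istringstream in(
        "<IMG align=\"left\"\n"
        "src=\"http://www.w3.org/Icons/WWW/w3c_home\" />");
    return read_like_question(in);
}
```

The returned string has the element name and both attributes run together with no separators, so the parser is no longer looking at the document that is on disk.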
Unset the skipws flag on the stream before constructing the iterators, for example with file.unsetf(std::ios::skipws), to get the correct behaviour.
Alternatively, you can use the file class from rapidxml_utils.hpp to load the file, for example rapidxml::file<> xmlFile("./test.xml"); followed by doc.parse<0>(xmlFile.data());
Sadly, loading text files with C++ streams is very tricky and full of traps.
As for sehe's tests in the other answer, the "incorrectly accepted" cases are by design (I don't have enough reputation to comment on his answer). You need to pass the parse_validate_closing_tags parse flag to make the parser check whether each end tag name matches the start tag name, for example doc.parse<rapidxml::parse_validate_closing_tags>(&xml[0]);
See parse_validate_closing_tags in the rapidxml manual.
The rationale for this behaviour is performance: verifying end tags is time-consuming and in most cases not needed.
I just tried it out of curiosity. RapidXml might be fast, but it sure isn't very good.
Invoking it results in all kinds of funny business:
Correctly accepted:
Correctly rejected:
Incorrectly accepted:
YMMV
Your XML is valid. If the code and the XML are exactly as you posted, it must be a rapidxml bug. I guess it either doesn't support breaking an attribute list across multiple lines or, less likely, doesn't support /> as the end of a tag.