XmlReader and MemoryStream: the returned XML is missing tags
Could someone explain this behaviour to me?
If you execute the snippet at the bottom of the post with the first string, it returns the exact same string as the one used for the input; that's what I expected.
input 1:
<?xml version='1.0' encoding='UTF-8'?>
<Company>
<Creator>Me</Creator>
<CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime>
<Contacts>
<Contact>
<ContactID>365</ContactID>
</Contact>
</Contacts>
</Company>
output 1:
<?xml version='1.0' encoding='UTF-8'?>
<Company>
<Creator>Me</Creator>
<CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime>
<Contacts>
<Contact>
<ContactID>365</ContactID>
</Contact>
</Contacts>
</Company>
Now if you use the second string (the commented-out const string xml declaration), which is exactly the same string but on one line instead of two, it returns the following:
input 2:
<?xml version='1.0' encoding='UTF-8'?><Company>
<Creator>Me</Creator>
<CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime>
<Contacts>
<Contact>
<ContactID>365</ContactID>
</Contact>
</Contacts>
</Company>
output 2:
<?xml version='1.0' encoding='UTF-8'?>
<Creator>Me</Creator>2010-01-25T21:58:32.493
<Contacts>
<Contact>
<ContactID>365</ContactID>
</Contact>
</Contacts>
The only difference between the two inputs is that the first one has a line break right after the XML declaration, but as you can see, the second output is missing the parent tag (Company) and the tags around the third element (CreationDateTime). Any thoughts?
Here is the code I used:
public void XmlReader_Eats_Tags_IsTrue()
{
    // The first XML string is on two lines: the line break comes right after the XML declaration.
    const string xml = @"<?xml version='1.0' encoding='UTF-8'?>
<Company><Creator>Me</Creator><CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime><Contacts><Contact><ContactID>365</ContactID></Contact></Contacts></Company>";
    // The second XML string is on one line.
    //const string xml = @"<?xml version='1.0' encoding='UTF-8'?><Company><Creator>Me</Creator><CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime><Contacts><Contact><ContactID>365</ContactID></Contact></Contacts></Company>";
    BufferedStream stream = new BufferedStream(new MemoryStream());
    stream.Write(Encoding.ASCII.GetBytes(xml), 0, xml.Length);
    stream.Seek(0, SeekOrigin.Begin);
    StreamReader streamReaderXml = new StreamReader(stream);
    XmlReader xmlR = XmlReader.Create(streamReaderXml);
    XmlReaderSettings xmlReaderset =
        new XmlReaderSettings { ValidationType = ValidationType.Schema };
    // ValidationCallBack is the validation event handler (its definition is not shown in this post).
    xmlReaderset.Schemas.ValidationEventHandler += ValidationCallBack;
    MemoryStream ms = new MemoryStream();
    XmlWriterSettings xmlWriterSettings =
        new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false),
            ConformanceLevel = ConformanceLevel.Fragment
        };
    using (XmlWriter xmlTw = XmlWriter.Create(ms, xmlWriterSettings))
    {
        using (XmlReader xmlRead = XmlReader.Create(xmlR, xmlReaderset))
        {
            int i = 0;
            while (xmlRead.Read())
            {
                Console.WriteLine("{0}:{1}; node type: {2}", i, xmlRead.Name, xmlRead.NodeType);
                // Reads the whole file and will call the validation handler subroutine if an error is detected.
                xmlTw.WriteNode(xmlRead, true);
                i++;
            }
            xmlTw.Flush();
            xmlRead.Close();
        }
        string xmlString = Encoding.UTF8.GetString(ms.ToArray());
        Console.WriteLine(xmlString);
    }
}
The problem is that you're using XmlWriter.WriteNode(reader, true) and also calling XmlReader.Read(). WriteNode already moves the reader onto the next sibling node, so you're effectively skipping over data when you then call Read again.

I suspect it happens to work in the first version because the second call to Read only skips over the whitespace after the XML declaration, and the second call to WriteNode then copies the rest of the document.
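As a rough illustration of the explanation above, here is a minimal sketch of two ways to copy the document without calling Read after each WriteNode. The class and method names (XmlCopyDemo, CopyAll, CopyNodeByNode) are hypothetical, the input is read through a StringReader instead of the BufferedStream from the question, and schema validation is left out to keep the example short. It relies on two points: WriteNode called on a reader that is still in its initial state copies everything up to the end of the input, and each WriteNode call leaves the reader on the node that follows the one it copied.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class; the names are illustrative and not from the original post.
public static class XmlCopyDemo
{
    // Same writer settings as in the question.
    private static XmlWriterSettings WriterSettings()
    {
        return new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false),
            ConformanceLevel = ConformanceLevel.Fragment
        };
    }

    // Option 1: one WriteNode call on a reader in its initial state
    // copies the whole input, so no Read loop is needed.
    public static string CopyAll(string xml)
    {
        using (var ms = new MemoryStream())
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            using (XmlWriter writer = XmlWriter.Create(ms, WriterSettings()))
            {
                writer.WriteNode(reader, true);
            }
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    // Option 2: copy one top-level node per iteration. Read is called only once,
    // to position the reader on the first node; after that, WriteNode itself
    // advances the reader, so the loop tests EOF instead of calling Read again.
    public static string CopyNodeByNode(string xml)
    {
        using (var ms = new MemoryStream())
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            using (XmlWriter writer = XmlWriter.Create(ms, WriterSettings()))
            {
                reader.Read();
                while (!reader.EOF)
                {
                    writer.WriteNode(reader, true);
                }
            }
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        // The one-line input that lost tags in the question.
        const string oneLine = "<?xml version='1.0' encoding='UTF-8'?><Company><Creator>Me</Creator><CreationDateTime>2010-01-25T21:58:32.493</CreationDateTime><Contacts><Contact><ContactID>365</ContactID></Contact></Contacts></Company>";
        Console.WriteLine(CopyAll(oneLine));
        Console.WriteLine(CopyNodeByNode(oneLine));
    }
}
```

Option 2 is probably the closer fit if you want to log each top-level node as in the question's loop, since you can inspect reader.NodeType before each WriteNode call.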