C# Serializing DataContracts from a file

Posted on 2024-12-09 11:52:20

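I have a list of XML messages, specifically DataContract messages, that I record to a file. I am trying to deserialize them from the file one by one. I do not want to read the whole file into memory at once, because I expect it to be very big.

I already have a working implementation: I read bytes from a FileStream, use a regular expression to find the end of each element, and then pass that element to DataContractSerializer to get the actual object.

However, I was told I should be using higher-level code for this task, and it seems like that should be possible. I have the following code, which I think should work, but it doesn't.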

FileStream readStream = File.OpenRead(filename);
DataContractSerializer ds = new DataContractSerializer(typeof(MessageType));
MessageType msg;
while ((msg = (MessageType)ds.ReadObject(readStream)) != null)
{
    Console.WriteLine("Test " + msg.Property1);
}

The code above is fed an input file that looks roughly like this:

<MessageType>....</MessageType>
<MessageType>....</MessageType>
<MessageType>....</MessageType>

It appears that I can read and deserialize the first element correctly, but after that it fails with:

System.Runtime.Serialization.SerializationException was unhandled
  Message=There was an error deserializing the object of type MessageType. The data at the root level is invalid. Line 1, position 1.
  Source=System.Runtime.Serialization

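I read somewhere that this is caused by the way DataContractSerializer handles a stream padded with '\0' at the end, but I couldn't figure out how to fix it when reading from a stream without locating the end of each MessageType tag in some other way. Is there another serialization class I should be using, or a way around this problem?

Thanks!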


无法回应 2024-12-16 11:52:20

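When you deserialize from a stream, the DataContractSerializer by default uses a reader that only accepts well-formed XML documents. The file you are reading is not one: it contains multiple root elements, so it is effectively a fragment. You can use the ReadObject overload that takes an XmlReader, as in the example below, and create that reader with an XmlReaderSettings object that accepts fragments (ConformanceLevel.Fragment). Alternatively, you can put a wrapper element around the <MessageType> elements and read until the reader is positioned at the wrapper's end element.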

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

public class StackOverflow_7760551
{
    [DataContract]
    public class Person
    {
        [DataMember]
        public string Name { get; set; }
        [DataMember]
        public int Age { get; set; }

        public override string ToString()
        {
            return string.Format("Person[Name={0},Age={1}]", this.Name, this.Age);
        }
    }

    public static void Test()
    {
        const string fileName = "test.xml";
        DataContractSerializer dcs = new DataContractSerializer(typeof(Person));

        // Write each object as its own root element, which produces an XML fragment.
        using (FileStream fs = File.Create(fileName))
        {
            Person[] people = new Person[]
            {
                new Person { Name = "John", Age = 33 },
                new Person { Name = "Jane", Age = 28 },
                new Person { Name = "Jack", Age = 23 }
            };

            XmlWriterSettings ws = new XmlWriterSettings
            {
                Indent = true,
                IndentChars = "  ",
                OmitXmlDeclaration = true,          // no <?xml ...?> before each element
                Encoding = new UTF8Encoding(false), // no byte order mark
                CloseOutput = false,                // keep fs open between writers
            };

            foreach (Person p in people)
            {
                using (XmlWriter w = XmlWriter.Create(fs, ws))
                {
                    dcs.WriteObject(w, p);
                }
            }
        }

        Console.WriteLine(File.ReadAllText(fileName));

        // Read the objects back one at a time with a reader that accepts fragments.
        using (FileStream fs = File.OpenRead(fileName))
        {
            XmlReaderSettings rs = new XmlReaderSettings
            {
                ConformanceLevel = ConformanceLevel.Fragment,
            };
            using (XmlReader r = XmlReader.Create(fs, rs))
            {
                // MoveToContent skips whitespace between elements and
                // returns XmlNodeType.None once the end of the input is reached.
                while (r.MoveToContent() == XmlNodeType.Element)
                {
                    Person p = (Person)dcs.ReadObject(r);
                    Console.WriteLine(p);
                }
            }
        }

        File.Delete(fileName);
    }
}
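The wrapper-element alternative mentioned in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the wrapper name "Messages" and the `WrappedLog` class with its `Write` and `Read` methods are hypothetical names, and `Person` stands in for the real message type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

[DataContract]
public class Person
{
    [DataMember] public string Name { get; set; }
    [DataMember] public int Age { get; set; }
}

public static class WrappedLog
{
    // Hypothetical wrapper element name chosen for this sketch.
    private const string WrapperName = "Messages";

    // Writes all objects inside a single root element, so the result
    // is a well-formed XML document rather than a fragment.
    public static void Write(Stream output, IEnumerable<Person> people)
    {
        var dcs = new DataContractSerializer(typeof(Person));
        var settings = new XmlWriterSettings { Indent = true, CloseOutput = false };
        using (XmlWriter w = XmlWriter.Create(output, settings))
        {
            w.WriteStartElement(WrapperName);
            foreach (Person p in people)
            {
                dcs.WriteObject(w, p);
            }
            w.WriteEndElement();
        }
    }

    // Yields one object at a time; the stream is read incrementally.
    public static IEnumerable<Person> Read(Stream input)
    {
        var dcs = new DataContractSerializer(typeof(Person));
        using (XmlReader r = XmlReader.Create(input))
        {
            // Consume the opening tag of the wrapper element.
            r.ReadStartElement(WrapperName);

            // Each child element is one serialized object. The loop ends
            // when the reader reaches the wrapper's end element.
            while (r.MoveToContent() == XmlNodeType.Element)
            {
                yield return (Person)dcs.ReadObject(r);
            }
        }
    }
}
```

Because `Read` is an iterator, a caller can `foreach` over a `FileStream` without holding every message in memory. One trade-off of this design is that the closing wrapper tag must be written last, so appending messages to an existing file later would require rewriting the end of the file; the fragment approach does not have that constraint.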


ぇ气 2024-12-16 11:52:20

Perhaps your file contains a BOM.
That is common with UTF-8 encoding.
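One way to test this suggestion is to look at the first three bytes of the file, since the UTF-8 byte order mark is the sequence EF BB BF. The following is a minimal sketch; `BomCheck` and `StartsWithUtf8Bom` are hypothetical names, and the check only covers a BOM at the very start of the stream.

```csharp
using System.IO;

public static class BomCheck
{
    // Returns true when the stream begins with the UTF-8 byte order mark (EF BB BF).
    // Note: this consumes up to three bytes from the stream.
    public static bool StartsWithUtf8Bom(Stream s)
    {
        byte[] head = new byte[3];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop until
        // three bytes are available or the stream ends.
        while (read < head.Length)
        {
            int n = s.Read(head, read, head.Length - read);
            if (n == 0)
            {
                break;
            }
            read += n;
        }

        return read == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
    }
}
```

A caller could open the log with `File.OpenRead(filename)`, pass the stream to `StartsWithUtf8Bom`, and then reopen or rewind the file before deserializing.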


め七分饶幸 2024-12-16 11:52:20

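// Note: XmlDocument.Load reads the whole input into memory and requires a
// well-formed document with a single root element.
XmlSerializer xml = new XmlSerializer(typeof(MessageType));
XmlDocument xdoc = new XmlDocument();
xdoc.Load(stream);
foreach (XmlElement elm in xdoc.GetElementsByTagName("MessageType"))
{
    // Deserialize each <MessageType> element from its own XML text.
    MessageType mt = (MessageType)xml.Deserialize(new StringReader(elm.OuterXml));
}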
