Fast way to deserialize XML with special characters

Posted on 2024-10-16 01:40:03

I am looking for a fast way to deserialize XML that contains special characters such as ö.

I was using XmlReader, and it fails to deserialize such characters.

Any suggestions?

EDIT: I am using C#. The code is as follows:

XElement element = ...; // holds the XML
XmlSerializer serializer = new XmlSerializer(typeof(MyType));
XmlReader reader = element.CreateReader();
object o = serializer.Deserialize(reader);
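
A self-contained version of the snippet above can help narrow the problem down. The sketch below assumes a hypothetical MyType with a single Name property and XML whose root element is MyType; neither is given in the question. Because the XML is parsed from a .NET string, no byte decoding takes place, so if ö survives here, the corruption likely happens earlier, at the point where the XML bytes are first read.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's MyType.
public class MyType
{
    public string Name { get; set; }
}

public static class XElementDemo
{
    // Same steps as in the question: XElement -> XmlReader -> XmlSerializer.
    public static MyType FromXElement(XElement element)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MyType));
        using (XmlReader reader = element.CreateReader())
        {
            return (MyType)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // Parsed from a string literal, so the characters are already decoded.
        XElement element = XElement.Parse("<MyType><Name>K\u00F6ln</Name></MyType>");
        MyType result = FromXElement(element);
        Console.WriteLine(result.Name);
    }
}
```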

Comments (3)

恰似旧人归 2024-10-23 01:40:03

I'd guess you're having an encoding issue, not in the XmlReader but with the XmlSerializer.

You could use an XmlTextWriter with UTF-8 encoding together with the XmlSerializer, as in the following snippet (see the generic methods below for a much nicer implementation). It works fine with umlauts (äöü) and other special characters.

using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

class Program
{
    static void Main(string[] args)
    {
        SpecialCharacters specialCharacters = new SpecialCharacters { Umlaute = "äüö" };

        // Serialize the object to XML, writing UTF-8 bytes into a MemoryStream.
        string serializedXml;
        using (MemoryStream memoryStreamSerialize = new MemoryStream())
        using (XmlTextWriter xmlTextWriterSerialize = new XmlTextWriter(memoryStreamSerialize, Encoding.UTF8))
        {
            XmlSerializer xmlSerializerSerialize = new XmlSerializer(typeof(SpecialCharacters));
            xmlSerializerSerialize.Serialize(xmlTextWriterSerialize, specialCharacters);
            xmlTextWriterSerialize.Flush();

            // Decode the UTF-8 bytes into a string.
            serializedXml = Encoding.UTF8.GetString(memoryStreamSerialize.ToArray());
        }

        // Deserialize the XML back into an object.

        // Encode the string as a UTF-8 byte array.
        byte[] byteArray = Encoding.UTF8.GetBytes(serializedXml);

        using (MemoryStream memoryStreamDeserialize = new MemoryStream(byteArray))
        {
            XmlSerializer xmlSerializerDeserialize = new XmlSerializer(typeof(SpecialCharacters));

            // No writer is needed for reading: the serializer consumes the stream directly.
            SpecialCharacters deserializedObject = (SpecialCharacters)xmlSerializerDeserialize.Deserialize(memoryStreamDeserialize);
        }
    }
}

[Serializable]
public class SpecialCharacters
{
    public string Umlaute { get; set; }
}
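
To illustrate where an encoding problem of this kind can come from, here is a minimal sketch. It assumes a hypothetical City type and an input document saved as ISO-8859-1 with a matching encoding declaration; nothing in the question confirms that this is the asker's situation. It compares handing the raw bytes to the serializer with decoding them as UTF-8 first.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class City
{
    public string Name { get; set; }
}

public static class EncodingDemo
{
    // Builds XML bytes encoded as ISO-8859-1, with a matching declaration.
    public static byte[] Latin1Xml()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><City><Name>K\u00F6ln</Name></City>";
        return Encoding.GetEncoding("ISO-8859-1").GetBytes(xml);
    }

    // Passes the raw bytes on; the XML parser reads the declared encoding itself.
    public static string ReadFromStream(byte[] bytes)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(City));
        using (MemoryStream stream = new MemoryStream(bytes))
        {
            return ((City)serializer.Deserialize(stream)).Name;
        }
    }

    // Decodes the bytes as UTF-8 first, which is the wrong encoding for this input.
    public static string ReadViaUtf8String(byte[] bytes)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(City));
        string text = Encoding.UTF8.GetString(bytes);
        using (StringReader reader = new StringReader(text))
        {
            return ((City)serializer.Deserialize(reader)).Name;
        }
    }

    public static void Main()
    {
        byte[] bytes = Latin1Xml();
        Console.WriteLine(ReadFromStream(bytes) == "K\u00F6ln");
        Console.WriteLine(ReadViaUtf8String(bytes) == "K\u00F6ln");
    }
}
```

If the input bytes are not UTF-8, forcing a UTF-8 decode before deserializing replaces the ö with a substitution character, whereas passing the stream through leaves the choice of encoding to the XML declaration.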

I personally use the following generic methods to serialize and deserialize XML and objects, and I haven't had any performance or encoding issues yet.

public static string SerializeObjectToXml<T>(T obj)
{
    using (MemoryStream memoryStream = new MemoryStream())
    using (XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
        xmlSerializer.Serialize(xmlTextWriter, obj);
        xmlTextWriter.Flush();

        // Decode the UTF-8 bytes that were written to the stream.
        return ByteArrayToStringUtf8(memoryStream.ToArray());
    }
}

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(StringToByteArrayUtf8(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (StreamReader xmlStreamReader = new StreamReader(memoryStream, Encoding.UTF8))
        {
            return (T)xmlSerializer.Deserialize(xmlStreamReader);
        }
    }
}

public static string ByteArrayToStringUtf8(byte[] value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetString(value);
}

public static byte[] StringToByteArrayUtf8(string value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(value);
}
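
When the XML only ever lives in a .NET string, the byte round trip can be skipped altogether. The following is a sketch of an alternative, not the answerer's method: the XmlStringHelper class and the Greeting type are hypothetical names, and it uses StringWriter and StringReader in place of a MemoryStream.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class XmlStringHelper
{
    // Serializes straight into a string; no byte encoding step is involved.
    public static string Serialize<T>(T obj)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, obj);
            return writer.ToString();
        }
    }

    // Reads the already-decoded characters of the string.
    public static T Deserialize<T>(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        using (StringReader reader = new StringReader(xml))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}

// Hypothetical type used only for this illustration.
public class Greeting
{
    public string Text { get; set; }
}

public static class XmlStringHelperDemo
{
    public static void Main()
    {
        string xml = XmlStringHelper.Serialize(new Greeting { Text = "\u00E4\u00F6\u00FC" });
        Greeting copy = XmlStringHelper.Deserialize<Greeting>(xml);
        Console.WriteLine(copy.Text);
    }
}
```

One trade-off: a StringWriter reports UTF-16 as its encoding, so the generated XML declaration says encoding="utf-16". That is harmless while the XML stays in a string, but it would be misleading if the string were later written to a file as UTF-8 bytes.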

染墨丶若流云 2024-10-23 01:40:03

What works for me is similar to what @martin-buberl suggested:

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
    using (StreamReader reader = new StreamReader(memoryStream, Encoding.UTF8))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
        return (T)xmlSerializer.Deserialize(reader);
    }
}

冰雪梦之恋 2024-10-23 01:40:03

The simplest way to do this is to Base64-encode the strings before serializing them. Base64 turns any sequence of bytes into a string of printable ASCII characters, which removes the need to do "5000 conversions".

// "class" is a reserved word in C#, so the variable is named payload here.
Serializable_Class payload = new Serializable_Class();

string xml_string1 = SOME_XML1;
string xml_string2 = SOME_XML2;
string xml_string3 = SOME_XML3;

// Encode each string as UTF-8 bytes, then as Base64 text.
payload.item1 = Convert.ToBase64String(Encoding.UTF8.GetBytes(xml_string1));
payload.item2 = Convert.ToBase64String(Encoding.UTF8.GetBytes(xml_string2));
payload.item3 = Convert.ToBase64String(Encoding.UTF8.GetBytes(xml_string3));

System.IO.MemoryStream payload_stream = new System.IO.MemoryStream();

System.Xml.Serialization.XmlSerializer payload_generator = new System.Xml.Serialization.XmlSerializer(payload.GetType());

payload_generator.Serialize(payload_stream, payload);

// Copy the serialized XML out of the stream before releasing it.
byte[] serialised_class = payload_stream.ToArray();

payload_stream.Dispose();

The string members of the class to be serialised must first be converted to Base64 strings. A MemoryStream is then created to hold the serialised output in memory, and an XmlSerializer is created for the object's type. The MemoryStream is passed to Serialize() so that the XmlSerializer writes into it. Once serialisation has finished, calling ToArray() on the MemoryStream returns the serialised object as a byte array.
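
The answer shows only the serialising half. A sketch of the full round trip, including the decode step on the receiving side, might look like the following. The Payload type with a single Item1 property is a hypothetical, reduced stand-in for Serializable_Class, and Pack and Unpack are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical stand-in for Serializable_Class, reduced to one member.
public class Payload
{
    public string Item1 { get; set; }
}

public static class Base64Demo
{
    // Base64-encodes the text, then serializes the wrapper object to XML bytes.
    public static byte[] Pack(string xmlFragment)
    {
        Payload payload = new Payload
        {
            Item1 = Convert.ToBase64String(Encoding.UTF8.GetBytes(xmlFragment))
        };
        XmlSerializer serializer = new XmlSerializer(typeof(Payload));
        using (MemoryStream stream = new MemoryStream())
        {
            serializer.Serialize(stream, payload);
            return stream.ToArray();
        }
    }

    // Deserializes the wrapper object, then reverses the Base64 and UTF-8 steps.
    public static string Unpack(byte[] serialized)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Payload));
        using (MemoryStream stream = new MemoryStream(serialized))
        {
            Payload payload = (Payload)serializer.Deserialize(stream);
            return Encoding.UTF8.GetString(Convert.FromBase64String(payload.Item1));
        }
    }

    public static void Main()
    {
        byte[] packed = Pack("<note>\u00E4\u00F6\u00FC</note>");
        Console.WriteLine(Unpack(packed));
    }
}
```

Note that Base64 makes the encoded text about one third larger and unreadable in the serialised XML, and both sides have to agree on the UTF-8 step, so this works around an encoding problem rather than removing its cause.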
