How to resolve the "unable to switch the encoding" error when inserting XML into SQL Server

Posted on 2024-09-24 22:51:39

I'm trying to insert into an XML column (SQL Server 2008 R2), but the server complains:

System.Data.SqlClient.SqlException (0x80131904):
XML parsing: line 1, character 39, unable to switch the encoding

I found out that the XML column has to be UTF-16 in order for the insert to succeed.

The code I'm using is:

 XmlSerializer serializer = new XmlSerializer(typeof(MyMessage));
 StringWriter str = new StringWriter();
 serializer.Serialize(str, message);
 string messageToLog = str.ToString();

How can I serialize the object to a UTF-8 string?

EDIT: Ok, sorry for the mixup - the string needs to be in UTF-8. You were right - it's UTF-16 by default, and if I try to insert in UTF-8 it passes. So the question is how to serialize into UTF-8.

Example

This causes errors while trying to insert into SQL Server:

    <?xml version="1.0" encoding="utf-16"?>
    <MyMessage>Teno</MyMessage>

This doesn't:

    <?xml version="1.0" encoding="utf-8"?>
    <MyMessage>Teno</MyMessage>

Update

I figured out when SQL Server 2008 needs utf-8 and when it needs utf-16 in the encoding attribute of the XML declaration you're trying to insert into an Xml column:

To insert XML declared as utf-8, add the parameter to the SQL command like this:

 sqlcmd.Parameters.Add("ParamName", SqlDbType.VarChar).Value = xmlValueToAdd;

If you try to add xmlValueToAdd with encoding=utf-16 using the previous line, the insert produces an error. Also, VarChar means that national characters aren't recognized (they turn out as question marks).

To insert XML declared as utf-16, use either SqlDbType.NVarChar or SqlDbType.Xml in the previous example, or don't specify the type at all:

 sqlcmd.Parameters.Add(new SqlParameter("ParamName", xmlValueToAdd));

Comments (9)

橙幽之幻 2024-10-01 22:51:39

This question is a near-duplicate of two others, and surprisingly, although this one is the most recent, I believe it is missing the best answer.

The duplicates, and what I believe to be their best answers, are:

In the end, it doesn't matter what encoding is declared or used, as long as the XmlReader can parse it locally within the application server.

As was confirmed in "Most efficient way to read XML in ADO.net from XML type column in SQL server?", SQL Server stores XML in an efficient binary format. By using the SqlXml class, ADO.NET can communicate with SQL Server in this binary format, without requiring the database server to do any serialization or deserialization of XML. This should also be more efficient for transport across the network.

By using SqlXml, XML will be sent pre-parsed to the database, and then the DB doesn't need to know anything about character encodings - UTF-16 or otherwise. In particular, note that the XML declarations aren't even persisted with the data in the database, regardless of which method is used to insert it.

Please refer to the above-linked answers for methods that look very similar to this, but this example is mine:

using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;
using System.Xml;

static class XmlDemo {
    static void Main(string[] args) {
        using(SqlConnection conn = new SqlConnection()) {
            conn.ConnectionString = "...";
            conn.Open();

            using(SqlCommand cmd = new SqlCommand("Insert Into TestData(Xml) Values (@Xml)", conn)) {

                cmd.Parameters.Add(new SqlParameter("@Xml", SqlDbType.Xml) {
                    // Works.
                    // Value = "<Test/>"

                    // Works.  XML Declaration is not persisted!
                    // Value = "<?xml version=\"1.0\"?><Test/>"

                    // Works.  XML Declaration is not persisted!
                    // Value = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><Test/>"

                    // Error ("unable to switch the encoding" SqlException).
                    // Value = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test/>"

                    // Works.  XML Declaration is not persisted!
                    Value = new SqlXml(XmlReader.Create(new StringReader("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test/>")))
                });

                cmd.ExecuteNonQuery();
            }
        }
    }
}

Note that I would not consider the last (non-commented) example to be "production-ready", but left it as-is to be concise and readable. If done properly, both the StringReader and the created XmlReader should be initialized within using statements to ensure that their Close() methods are called when complete.
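The disposal pattern described above could look roughly like the following. This is only a sketch: `SqlXmlHelper` and `FromString` are hypothetical names that do not appear in the original answer, and it assumes the `SqlXml` constructor consumes the reader before it returns, so the readers can be disposed immediately afterwards.

```csharp
using System.Data.SqlTypes;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class SqlXmlHelper
{
    // Parses the XML text locally and returns a SqlXml value that can be
    // assigned to the Value of a SqlDbType.Xml parameter.
    public static SqlXml FromString(string xml)
    {
        // Both readers are disposed when the method returns. This assumes
        // SqlXml reads the whole document inside its constructor.
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader))
        {
            return new SqlXml(xmlReader);
        }
    }
}
```

In the example above, the parameter initializer would then read `Value = SqlXmlHelper.FromString("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test/>")`.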

From what I've seen, the XML declarations are never persisted when using an XML column. Even without using .NET and just using this direct SQL insert statement, for example, the XML declaration is not saved into the database with the XML:

Insert Into TestData(Xml) Values ('<?xml version="1.0" encoding="UTF-8"?><Test/>');

Now in terms of the OP's question, the object to be serialized still needs to be converted into an XML structure from the MyMessage object, and XmlSerializer is still needed for this. However, at worst, instead of serializing to a String, the message could instead be serialized to an XmlDocument - which can then be passed to SqlXml through a new XmlNodeReader - avoiding a de-serialization/serialization trip to a string. (See http://blogs.msdn.com/b/jongallant/archive/2007/01/30/how-to-convert-xmldocument-to-xmlreader-for-sqlxml-data-type.aspx for details and an example.)
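As a rough sketch of the XmlDocument route just described: the `MyMessage` class below is a hypothetical stand-in for the OP's type (its real members are not shown in the question), and `MessageXml.ToSqlXml` is an illustrative name. The serializer writes directly into the DOM through the writer returned by `CreateNavigator().AppendChild()`, so no intermediate string is created.

```csharp
using System.Data.SqlTypes;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's MyMessage type.
public class MyMessage
{
    public string Text { get; set; }
}

public static class MessageXml
{
    public static SqlXml ToSqlXml(MyMessage message)
    {
        var serializer = new XmlSerializer(typeof(MyMessage));
        var document = new XmlDocument();

        // Serialize straight into the XmlDocument instead of a string.
        using (XmlWriter writer = document.CreateNavigator().AppendChild())
        {
            serializer.Serialize(writer, message);
        }

        // Hand the DOM to SqlXml through an XmlNodeReader.
        using (var reader = new XmlNodeReader(document))
        {
            return new SqlXml(reader);
        }
    }
}
```

The returned value can be used as the `Value` of a `SqlDbType.Xml` parameter, as in the earlier example.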

Everything here was developed against and tested with .NET 4.0 and SQL Server 2008 R2.

Please don't waste resources by running XML through extra conversions (deserializations and serializations to DOM, strings, or otherwise), as shown in other answers here and elsewhere.

少女净妖师 2024-10-01 22:51:39

Although a .NET string is always UTF-16, you still need to serialize the object using UTF-16 encoding.
That should be something like this:

public static string ToString(object source, Type type, Encoding encoding)
{
    // The string to hold the object content
    String content;

    // Create a MemoryStream that the data can be written to and read from
    using (var stream = new MemoryStream())
    {
        // Create the xml serializer, the serializer needs to know the type
        // of the object that will be serialized
        var xmlSerializer = new XmlSerializer(type);

        // Create a XmlTextWriter to write the xml object source, we are going
        // to define the encoding in the constructor
        using (var writer = new XmlTextWriter(stream, encoding))
        {
            // Save the state of the object into the stream
            xmlSerializer.Serialize(writer, source);

            // Flush the stream
            writer.Flush();

            // Read the stream into a string
            using (var reader = new StreamReader(stream, encoding))
            {
                // Set the stream position to the beginning
                stream.Position = 0;

                // Read the stream into a string
                content = reader.ReadToEnd();
            }
        }
    }

    // Return the xml string with the object content
    return content;
}

By setting the encoding to Encoding.Unicode, not only will the string be UTF-16, but the XML declaration in it should also say UTF-16:

<?xml version="1.0" encoding="utf-16"?>
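If the goal is only to control which encoding the XML declaration names, a commonly used alternative to the MemoryStream round trip is a StringWriter subclass that reports a different Encoding. The sketch below assumes a hypothetical `MyMessage` type, and `EncodingStringWriter` and `MessageSerializer` are illustrative names. Note that the resulting .NET string is still UTF-16 in memory; only the declaration text changes, so it has to be paired with a matching parameter type as described in the question's update.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's MyMessage type.
public class MyMessage
{
    public string Text { get; set; }
}

// StringWriter always reports UTF-16; this subclass reports the encoding
// chosen by the caller, which the serializer writes into the declaration.
public sealed class EncodingStringWriter : StringWriter
{
    private readonly Encoding _encoding;

    public EncodingStringWriter(Encoding encoding)
    {
        _encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return _encoding; }
    }
}

public static class MessageSerializer
{
    public static string Serialize(MyMessage message, Encoding declaredEncoding)
    {
        var serializer = new XmlSerializer(typeof(MyMessage));
        using (var writer = new EncodingStringWriter(declaredEncoding))
        {
            serializer.Serialize(writer, message);
            return writer.ToString();
        }
    }
}
```

Calling `MessageSerializer.Serialize(message, Encoding.UTF8)` yields a string whose declaration names utf-8, while `Encoding.Unicode` yields utf-16.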

芸娘子的小脾气 2024-10-01 22:51:39

Isn't the easiest solution to tell the serializer not to output the XML declaration? .NET and SQL Server should sort the rest out between them.

        XmlSerializer serializer = new XmlSerializer(typeof(MyMessage));
        StringWriter str = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(str, new XmlWriterSettings { OmitXmlDeclaration = true }))
        {
            serializer.Serialize(writer, message);
        }
        string messageToLog = str.ToString();

少年亿悲伤 2024-10-01 22:51:39

It took me forever to re-solve this problem.

I was running a statement against SQL Server, something like:

UPDATE Customers 
SET data = '<?xml version="1.0" encoding="utf-16"?><MyMessage>Teno</MyMessage>';

and this gives the error:

Msg 9402, Level 16, State 1, Line 2
XML parsing: line 1, character 39, unable to switch the encoding

And the really simple fix is:

UPDATE Customers 
SET data = N'<?xml version="1.0" encoding="utf-16"?><MyMessage>Teno</MyMessage>';

The difference is prefixing the Unicode string with N:

N'<?xml version="1.0" encoding="utf-16"?><MyMessage>Teno</MyMessage>'

In the former case, an unprefixed string is assumed to be varchar (e.g. Windows-1252 code-page). When it encounters the encoding="utf-16" inside the string, there is a conflict (and rightly so, since the string isn't utf-16).

The fix is to pass the string to SQL server as an nvarchar (i.e. UTF-16):

N'<?xml version="1.0" encoding="utf-16"?>'

That way the string is UTF-16, which matches the utf-16 encoding that the XML says it is. The carpet matches the curtains, so to speak.

心是晴朗的。 2024-10-01 22:51:39

@ziesemer's answer (above) is the only fully correct answer to this question and to its linked duplicates. However, it could still use a little more explanation and some clarification. Consider this an extension of @ziesemer's answer.

Even if they produce the desired result, most answers to this question (including those on the duplicate questions) are convoluted and go through many unnecessary steps. The main issue here is an overall lack of understanding of how the XML datatype actually works in SQL Server (not surprising, given that it isn't well documented). The XML type:

  1. Is a highly optimized (for storage) type that converts the incoming XML into a binary format (which is documented somewhere on the MSDN site). The optimizations include:
    1. Converting numbers and dates from strings (as they are in the XML) into binary representations if the element or attribute is tagged with type information (this might require specifying an XML Schema Collection). That is, the number "1234567" is stored as a 4-byte "int" instead of a 14-byte UTF-16 string of 7 digits.
    2. Element and attribute names are stored in a dictionary and given a numeric ID. That numeric ID is used in the XML tree structure. That is, "<ElementName>...</ElementName>" takes up 27 characters (i.e. 54 bytes) in string form, but only 11 characters (i.e. 22 bytes) when stored in the XML type. And that is for a single instance. In string form, each additional instance takes up another 54 bytes, but in the XML type each instance only takes up the space of that numeric ID, most likely a 4-byte int.
  2. Stores strings as UTF-16 Little Endian, always. This is most likely why the XML declaration is not stored: it is unnecessary, because the encoding can never change and so would always be the same.
  3. When there is no XML declaration, assumes the encoding is UTF-16, not UTF-8.
  4. Can have 8-bit / non-UTF-16 data passed in. In this case, you need to make sure that the string is not an NVARCHAR string (i.e. not prefixed with an upper-case "N" for literals, not declared as NVARCHAR when dealing with T-SQL variables, and not declared as SqlDbType.NVarChar in .NET). You also need to make sure that you do have the XML declaration, and that it specifies the correct encoding.

    PRINT 'VARCHAR / UTF-8:';
    DECLARE @XML_VC_8 XML;
    SET @XML_VC_8 = '<?xml version="1.0" encoding="utf-8"?><test/>';
    PRINT 'Success!'
    -- Success!
    
    GO
    PRINT '';
    PRINT 'NVARCHAR / UTF-8:';
    DECLARE @XML_NVC_8 XML;
    SET @XML_NVC_8 = N'<?xml version="1.0" encoding="utf-8"?><test/>';
    PRINT 'Success!'
    /*
    Msg 9402, Level 16, State 1, Line XXXXX
    XML parsing: line 1, character 38, unable to switch the encoding
    */
    
    GO
    PRINT '';
    PRINT 'VARCHAR / UTF-16:';
    DECLARE @XML_VC_16 XML;
    SET @XML_VC_16 = '<?xml version="1.0" encoding="utf-16"?><test/>';
    PRINT 'Success!'
    /*
    Msg 9402, Level 16, State 1, Line XXXXX
    XML parsing: line 1, character 38, unable to switch the encoding
    */
    
    GO
    PRINT '';
    PRINT 'NVARCHAR / UTF-16:';
    DECLARE @XML_NVC_16 XML;
    SET @XML_NVC_16 = N'<?xml version="1.0" encoding="utf-16"?><test/>';
    PRINT 'Success!'
    -- Success!
    

    As you can see, when the input string is NVARCHAR, then the XML declaration can be included, but it needs to be "UTF-16".

  5. When the input string is VARCHAR then the XML declaration can be included, but it cannot be "UTF-16". It can, however, be any valid 8-bit encoding, in which case the bytes for that encoding will be converted into UTF-16, as shown below:

    DECLARE @XML XML;
    SET @XML = '<?xml version="1.0" encoding="utf-8"?><test attr="'
               + CHAR(0xF0) + CHAR(0x9F) + CHAR(0x98) + CHAR(0x8E) + '"/>';
    SELECT @XML;
    -- <test attr="😎" />
    
    
    SET @XML = '<?xml version="1.0" encoding="Windows-1255"?><test attr="'
               + CONVERT(VARCHAR(10), 0xF9ECE5ED) + '"/>';
    SELECT @XML AS [XML from Windows-1255],
           CONVERT(VARCHAR(10), 0xF9ECE5ED) AS [Latin1_General / Windows-1252];
    /*
    XML from Windows-1255    Latin1_General / Windows-1252
    <test attr="שלום" />     ùìåí
    */
    

    The first example specifies the 4-byte UTF-8 sequence for Smiling Face with Sunglasses, and it gets converted correctly.
    The second example uses 4 bytes to represent 4 Hebrew letters making up the word "Shalom", which is converted correctly, and displayed correctly given that the "F9" byte, which is first, is the ש character, which is on the right-side of the word (since Hebrew is a right-to-left language). Yet those same 4 bytes display as ùìåí when selected directly since the default Collation for the current DB is Latin1_General_100_CS_AS_SC.
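Carried over to the .NET side, the VARCHAR / NVARCHAR rules above suggest choosing the parameter type from the encoding that the XML declaration names. The following is only an illustrative sketch: `XmlParameterType`, `DeclaredEncoding`, and `Choose` are hypothetical names, and the mapping is an assumption drawn from the T-SQL tests above (utf-16 or no declared encoding goes with NVarChar, any other declared encoding goes with VarChar). Keep in mind that a .NET string sent as VarChar is converted to an 8-bit code page, so characters outside that code page can be lost, as the question's update notes.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class XmlParameterType
{
    // Returns the encoding named in the XML declaration,
    // or null when there is no declaration or no encoding attribute.
    public static string DeclaredEncoding(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.Read();
            return reader.NodeType == XmlNodeType.XmlDeclaration
                ? reader.GetAttribute("encoding")
                : null;
        }
    }

    // Assumed mapping: utf-16 or undeclared -> NVarChar, otherwise VarChar.
    public static SqlDbType Choose(string xml)
    {
        string encoding = DeclaredEncoding(xml);
        bool utf16OrUndeclared = encoding == null
            || encoding.Equals("utf-16", StringComparison.OrdinalIgnoreCase);
        return utf16OrUndeclared ? SqlDbType.NVarChar : SqlDbType.VarChar;
    }
}
```

Passing the XML through SqlXml, as in @ziesemer's answer, avoids this decision entirely.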

妄想挽回 2024-10-01 22:51:39

A string is always UTF-16 in .NET, so as long as you stay inside your managed app you don't have to care about which encoding it is.

The problem is more likely where you talk to the SQL server. Your question doesn't show that code, so it's hard to pinpoint the exact error. My suggestion is to check whether there's a property or attribute you can set in that code that specifies the encoding of the data sent to the server.

玩心态 2024-10-01 22:51:39

You are serializing to a string rather than a byte array, so at this point no encoding has happened yet.

What does the start of "messageToLog" look like? Is the XML specifying an encoding (e.g. utf-8) which subsequently turns out to be wrong?

Edit

Based on your further info it sounds like the string is automatically converted to utf-8 when it is passed to the database, but the database chokes because the XML declaration says it is utf-16.

In that case, you don't need to serialize to utf-8. You need to serialize with the "encoding=" attribute omitted from the XML. The XmlFragmentWriter (not a standard part of .NET, Google it) lets you do this.

风苍溪 2024-10-01 22:51:39

The default encoding for an XML serializer should be UTF-16. Just to make sure, you can try:

XmlSerializer serializer = new XmlSerializer(typeof(YourObject));

// create a MemoryStream here, we are just working
// exclusively in memory
System.IO.Stream stream = new System.IO.MemoryStream();

// One of the XmlTextWriter constructors takes a stream and an encoding.
// Encoding.Unicode (in System.Text) is UTF-16 little endian; there is no Encoding.UTF16.
System.Xml.XmlTextWriter xtWriter = new System.Xml.XmlTextWriter(stream, System.Text.Encoding.Unicode);

serializer.Serialize(xtWriter, yourObjectInstance);

xtWriter.Flush();

够运 2024-10-01 22:51:39

Try this: delete the encoding attribute from the XML text.

Something like this:

xmlTXT = xmlTXT.Replace("encoding=\"utf-8\"", "");

or this (the same replacement written as a verbatim string):

xmlTXT = xmlTXT.Replace(@"encoding=""utf-8""", "");
