Writing an XML file using XmlTextWriter with ISO-8859-1 encoding
I'm having a problem writing Norwegian characters into an XML file using C#. I have a string variable containing some Norwegian text (with letters like æøå).
I'm writing the XML using an XmlTextWriter, writing the contents to a MemoryStream like this:
MemoryStream stream = new MemoryStream();
XmlTextWriter xmlTextWriter = new XmlTextWriter(stream, Encoding.GetEncoding("ISO-8859-1"));
xmlTextWriter.Formatting = Formatting.Indented;
xmlTextWriter.WriteStartDocument(); //Start doc
Then I add my Norwegian text like this:
xmlTextWriter.WriteCData(myNorwegianText);
Then I write the file to disk like this:
FileStream myFile = new FileStream(myPath, FileMode.Create);
StreamWriter sw = new StreamWriter(myFile);
stream.Position = 0;
StreamReader sr = new StreamReader(stream);
string content = sr.ReadToEnd();
sw.Write(content);
sw.Flush();
myFile.Flush();
myFile.Close();
Now the problem is that in the resulting file, all the Norwegian characters look funny.
I'm probably doing the above in some stupid way. Any suggestions on how to fix it?
Comments (6)
Why are you writing the XML first to a MemoryStream and then writing that to the actual file stream? That's pretty inefficient. If you write directly to the FileStream it should work.
If you still want to do the double write, for whatever reason, do one of two things. Either
Make sure that the StreamReader and StreamWriter objects you use all use the same encoding as the one you used with the XmlWriter (not just the StreamWriter, like someone else suggested), or
Don't use StreamReader/StreamWriter. Instead just copy the stream at the byte level using a simple byte[] and Stream.Read/Write. This is going to be, btw, a lot more efficient anyway.
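The byte-level copy described above can be sketched as follows. This is only an illustration: the class name StreamCopySketch, the method name CopyBytes, and the 4096-byte buffer size are assumptions, not part of the original answer.

```csharp
using System.IO;

public static class StreamCopySketch
{
    // Copies every remaining byte from source to destination.
    // Nothing is decoded as text, so the ISO-8859-1 bytes pass through unchanged.
    public static void CopyBytes(Stream source, Stream destination)
    {
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
        }
    }
}
```

In the question's code this would stand in for the StreamReader/StreamWriter pair: flush xmlTextWriter, set stream.Position = 0, then call StreamCopySketch.CopyBytes(stream, myFile).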
Both your StreamWriter and your StreamReader are using UTF-8, because you're not specifying the encoding. That's why things are getting corrupted.
As tomasr said, using a FileStream to start with would be simpler - but also MemoryStream has the handy "WriteTo" method which lets you copy it to a FileStream very easily.
I hope you've got a using statement in your real code, by the way - you don't want to leave your file handle open if something goes wrong while you're writing to it.
Jon
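A minimal sketch of the WriteTo approach with using statements, in case it helps. The class name WriteToSketch, the method name Save, and the "root" element are placeholders I made up; the question does not show which elements surround the CDATA section.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class WriteToSketch
{
    // Builds the XML in memory as ISO-8859-1, then copies the raw bytes to the file.
    public static void Save(string path, string text)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            XmlTextWriter writer = new XmlTextWriter(stream, Encoding.GetEncoding("ISO-8859-1"));
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("root"); // hypothetical element name
            writer.WriteCData(text);
            writer.WriteEndElement();
            writer.WriteEndDocument();
            writer.Flush(); // push buffered output into the MemoryStream

            // The using block closes the file handle even if the write throws.
            using (FileStream file = new FileStream(path, FileMode.Create))
            {
                stream.WriteTo(file); // byte-for-byte copy, no text decoding
            }
        }
    }
}
```

Because WriteTo copies bytes rather than characters, no StreamReader or StreamWriter is involved and nothing gets re-encoded on the way to disk.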
You need to set the encoding every time you write a string or read binary data as a string.
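To illustrate why the encoding matters on both sides, here is a small sketch that encodes text as ISO-8859-1 and decodes it with whatever encoding the caller passes in. The names EncodingMismatchSketch and RoundTrip are made up for this example.

```csharp
using System.Text;

public static class EncodingMismatchSketch
{
    // Encodes text as ISO-8859-1 bytes, then decodes those bytes with decodeWith.
    public static string RoundTrip(string text, Encoding decodeWith)
    {
        byte[] bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(text);
        return decodeWith.GetString(bytes);
    }
}
```

Calling RoundTrip("æøå", Encoding.GetEncoding("ISO-8859-1")) gives the original text back. Calling it with Encoding.UTF8 does not, because the ISO-8859-1 bytes for those letters are not valid UTF-8 sequences and come out as replacement characters (U+FFFD).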
As mentioned in the answers above, the biggest issue here is the Encoding, which is being defaulted because it is unspecified.
When you do not specify an Encoding for this kind of conversion, the default of UTF-8 is used, which may or may not match your scenario. You are also converting the data needlessly by pushing it into a MemoryStream and then out into a FileStream.
If your original data is not UTF-8, what will happen here is that when the MemoryStream contents are read back as text, the StreamReader will attempt to decode them using the default Encoding of UTF-8 and corrupt your data as a result. When you then write out to the FileStream through a StreamWriter, which is also using UTF-8 as its encoding by default, you simply persist that corruption into the file.
In order to fix the issue, you likely need to specify the Encoding on the reader and writer objects that wrap your Stream objects.
You can actually skip the MemoryStream process entirely, too, which will be faster and more efficient. Your updated code might look something more like:
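The code that accompanied this answer is not preserved here. As a rough sketch of the idea of skipping the MemoryStream, the XmlTextWriter can be pointed straight at the FileStream with an explicit ISO-8859-1 encoding. The class name DirectWriteSketch, the method name Save, and the "root" element are assumptions for illustration only, not the answerer's code.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class DirectWriteSketch
{
    // Writes the XML directly to disk as ISO-8859-1, with no intermediate buffer.
    public static void Save(string path, string text)
    {
        using (FileStream file = new FileStream(path, FileMode.Create))
        using (XmlTextWriter writer = new XmlTextWriter(file, Encoding.GetEncoding("ISO-8859-1")))
        {
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("root"); // hypothetical element name
            writer.WriteCData(text);
            writer.WriteEndElement();
            writer.WriteEndDocument();
        } // disposing the writer flushes it and closes the underlying file
    }
}
```

Since only one writer touches the data and it is given the encoding explicitly, there is no second text conversion that could fall back to UTF-8.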
Which encoding do you use for displaying the result file? If it is not in ISO-8859-1, it will not display correctly.
Is there a reason to use this specific encoding, instead of for example UTF8?
After investigating, this is what worked best for me: