XDocument: saving XML to a file without a BOM

Posted on 2024-10-16 17:20:05

I'm generating a UTF-8 XML file using XDocument.

XDocument xml_document = new XDocument(
    new XDeclaration("1.0", "utf-8", null),
    new XElement(ROOT_NAME,
        new XAttribute("note", note)
    )
);
...
xml_document.Save(@file_path);

The file is generated correctly and validates successfully against an XSD file.

When I try to upload the XML file to an online service, the service says that my file is wrong at line 1. I have discovered that the problem is caused by the BOM in the first bytes of the file.

Do you know why the BOM is added to the file, and how I can save the file without it?

As stated in the Byte order mark Wikipedia article:

While Unicode standard allows BOM in UTF-8 it does not require or recommend it. Byte order has no meaning in UTF-8 so a BOM only serves to identify a text stream or file as UTF-8 or that it was converted from another format that has a BOM

Is this an XDocument problem, or should I contact the online service provider and ask them to upgrade their parser?

那一片橙海 2024-10-23 17:20:05

Use an XmlTextWriter and pass it to the XDocument's Save() method. That way you have more control over the encoding used:

var doc = new XDocument(
    new XDeclaration("1.0", "utf-8", null),
    new XElement("root", new XAttribute("note", "boogers"))
);
using (var writer = new XmlTextWriter(".\\boogers.xml", new UTF8Encoding(false)))
{
    doc.Save(writer);
}

The UTF8Encoding constructor has an overload that takes a boolean specifying whether to emit the BOM (byte order mark); in your case, pass false.

The result of this code was verified using Notepad++ to inspect the file's encoding.
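
If you would rather check the result in code than in an editor, something along these lines might do. It is only a sketch: HasUtf8Bom is a hypothetical helper (not part of .NET), the temp-file name is made up, and it assumes a .NET version that supports top-level statements. It writes the document as above and then looks for the three UTF-8 BOM bytes (EF BB BF) at the start of the file.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: true when the file starts with the UTF-8 BOM bytes EF BB BF.
static bool HasUtf8Bom(string path)
{
    byte[] head = new byte[3];
    using (FileStream fs = File.OpenRead(path))
    {
        int read = fs.Read(head, 0, head.Length);
        return read == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
    }
}

// Hypothetical output location, chosen only for this illustration.
string path = Path.Combine(Path.GetTempPath(), "bom-check.xml");

var doc = new XDocument(
    new XDeclaration("1.0", "utf-8", null),
    new XElement("root", new XAttribute("note", "example")));

// Same approach as above: an XmlTextWriter with a BOM-less UTF8Encoding.
using (var writer = new XmlTextWriter(path, new UTF8Encoding(false)))
{
    doc.Save(writer);
}

// Report whether the saved file begins with a BOM.
Console.WriteLine(HasUtf8Bom(path));
```

The same helper can be pointed at a file produced by a plain Save(path) call to compare the two.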

薄荷港 2024-10-23 17:20:05

First of all: the service provider MUST handle it, according to the XML specification, which states that a BOM may be present in a UTF-8 encoded document.

You can force the XML to be saved without a BOM like this:

XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = new UTF8Encoding(false); // The false means, do not emit the BOM.
using (XmlWriter w = XmlWriter.Create("my.xml", settings))
{
    doc.Save(w);
}

(Found via Google here: http://social.msdn.microsoft.com/Forums/en/xmlandnetfx/thread/ccc08c65-01d7-43c6-adf3-1fc70fdb026a)
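
The same settings should also work when the XML is written to a stream rather than a file, which may be handy if the bytes are uploaded directly. This is a rough sketch, not a definitive implementation: the element and attribute values are placeholders, the MemoryStream stands in for whatever stream the upload would use, and it assumes a .NET version that supports top-level statements.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

var doc = new XDocument(
    new XDeclaration("1.0", "utf-8", null),
    new XElement("root", new XAttribute("note", "example")));

var settings = new XmlWriterSettings
{
    Encoding = new UTF8Encoding(false), // false: do not emit the BOM
    Indent = true
};

byte[] bytes;
using (var stream = new MemoryStream())
{
    // The writer must be disposed (flushed) before the bytes are read back.
    using (XmlWriter w = XmlWriter.Create(stream, settings))
    {
        doc.Save(w);
    }
    bytes = stream.ToArray();
}

// Without a BOM, the first byte is the '<' of the XML declaration.
Console.WriteLine(bytes[0] == (byte)'<');
```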

So要识趣 2024-10-23 17:20:05

The most expedient way to get rid of the BOM when using XDocument is to save the document, read the file straight back in with the File routines, and write it out again. File.ReadAllLines strips the BOM on reading, and File.WriteAllLines writes UTF-8 without a BOM by default:

        XDocument xTasks = new XDocument();
        XElement xRoot = new XElement("tasklist",
            new XAttribute("timestamp",lastUpdated),
            new XElement("lasttask",lastTask)
        );
        ...
        xTasks.Add(xRoot);
        xTasks.Save("tasks.xml");

        // read it straight in, write it straight back out. Done.
        string[] lines = File.ReadAllLines("tasks.xml");
        File.WriteAllLines("tasks.xml",lines);

(It's hokey, but it works for the sake of expediency. At least you'll have a well-formed file to upload to your online provider.) ;)

墨小沫ゞ 2024-10-23 17:20:05

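For UTF-8 documents, you can build the text yourself and write it with a BOM-less encoding:

// XDocument.ToString() omits the XML declaration, so prepend it manually.
// Assumes xDoc.Declaration is not null.
string xmlDec = xDoc.Declaration.ToString();
StringBuilder sb = new StringBuilder(xmlDec);
sb.Append(xDoc.ToString());
Encoding encoding = new UTF8Encoding(false); // false = without BOM
File.WriteAllText(outPath, sb.ToString(), encoding);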
