Preventing XmlDocument from validating namespaces in C#

Posted on 2024-08-30 13:25:35

I'm trying to find a way to indent an HTML file. So far I've been loading it into an XmlDocument and writing it back out through an XmlTextWriter.

However, I can't get this to work for HTML documents, because the parser inspects the DOCTYPE and tries to download the DTD.

Is there a "dumb" indenting mechanism that doesn't validate or check the document and just makes a best-effort attempt at indentation? The files are 4-10 MB in size and are autogenerated, and we have to handle them in-process. It's fine if the user has to wait; I just want to avoid forking a new process, etc.

Here's my code for reference:

        using (MemoryStream ms = new MemoryStream())
        using (XmlTextWriter xtw = new XmlTextWriter(ms, Encoding.Unicode))
        {
            XmlDocument doc = new XmlDocument();
            // Load the unformatted XML text string into an instance
            // of the XML Document Object Model (DOM)
            doc.LoadXml(content);

            // Set the formatting property of the XML text writer to indented;
            // the text writer is where the indenting will be performed
            xtw.Formatting = Formatting.Indented;

            // Write the DOM content to the XmlTextWriter
            doc.WriteContentTo(xtw);

            // Flush the contents of the text writer
            // to the memory stream, which is simply an in-memory file
            xtw.Flush();

            // Rewind to the start of the memory stream
            ms.Seek(0, SeekOrigin.Begin);

            // Create a reader to read the contents of
            // the memory stream back as a string
            using (StreamReader sr = new StreamReader(ms))
                return sr.ReadToEnd();
        }

Essentially, I currently use a MemoryStream, an XmlTextWriter and an XmlDocument; once the document is indented, I read it back from the MemoryStream and return it as a string. It fails for XHTML documents and some HTML 4 documents because it tries to fetch the DTDs. I tried setting XmlResolver to null, but to no avail :(


画▽骨i 2024-09-06 13:25:35

Without access to the specific X[H]TML that causes the problem, it's hard to know whether this will work, but have you tried using XDocument instead?

        // XDocument lives in the System.Xml.Linq namespace
        XDocument xdoc = XDocument.Parse(xml);
        string formatted = xdoc.ToString();
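
To make the XDocument suggestion a bit more concrete, here is a minimal sketch that also tells the reader to skip the DOCTYPE, so no DTD lookup should be attempted. The class name `MarkupIndenter` and the method `Indent` are hypothetical names chosen for illustration. The sketch assumes the input is well-formed XML (as XHTML should be); markup that is not well-formed will still fail to parse, and entity references that are only declared in the DTD (for example `&nbsp;`) will probably be rejected once the DTD is ignored. The DOCTYPE itself is not carried over to the output.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class MarkupIndenter
{
    // Hypothetical helper: re-indents well-formed XML/XHTML text
    // without parsing or downloading the DTD named in the DOCTYPE.
    public static string Indent(string content)
    {
        var readerSettings = new XmlReaderSettings
        {
            // Skip the DOCTYPE declaration instead of processing it
            DtdProcessing = DtdProcessing.Ignore,
            // Do not resolve any external resources
            XmlResolver = null,
            // Drop insignificant whitespace so the writer can re-indent
            IgnoreWhitespace = true
        };

        using (var stringReader = new StringReader(content))
        using (var reader = XmlReader.Create(stringReader, readerSettings))
        {
            XDocument doc = XDocument.Load(reader);
            // ToString() with default options returns indented XML
            return doc.ToString();
        }
    }
}
```

Calling `MarkupIndenter.Indent(content)` would take the place of the MemoryStream/XmlTextWriter round trip in the question. Note that `XDocument.ToString()` does not emit an XML declaration, so one would have to be added separately if it is needed.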


About the author: 心安伴我暖
