HtmlAgilityPack: how to create indented HTML?
So, I am generating HTML using HtmlAgilityPack and it's working perfectly, but the HTML text is not indented. I can get indented XML, but I need HTML. Is there a way?
HtmlDocument doc = new HtmlDocument();
// gen html
HtmlNode table = doc.CreateElement("table");
table.Attributes.Add("class", "tableClass");
HtmlNode tr = doc.CreateElement("tr");
table.ChildNodes.Append(tr);
HtmlNode td = doc.CreateElement("td");
td.InnerHtml = "—";
tr.ChildNodes.Append(td);
// write text, no indent :(
using (StreamWriter sw = new StreamWriter("table.html"))
{
    table.WriteTo(sw);
}
// write xml, nicely indented but it's XML!
XmlWriterSettings settings = new XmlWriterSettings();
settings.OmitXmlDeclaration = true;
settings.Indent = true;
settings.ConformanceLevel = ConformanceLevel.Fragment;
using (XmlWriter xw = XmlWriter.Create("table.xml", settings))
{
    table.WriteTo(xw);
}
Comments (4)
AngleSharp is fast, reliable, pure C#, and .NET Core compatible.
You can parse the HTML with AngleSharp,
which provides a way to auto-indent the output.
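The code from this answer is missing in this copy. A minimal sketch of the approach it describes might look like the following, assuming AngleSharp 0.10 or later, where `HtmlParser`, the `ToHtml` extension method, and `PrettyMarkupFormatter` are available. The class name `HtmlIndenter` and the method `Indent` are hypothetical names chosen for illustration.

```csharp
using System.IO;
using AngleSharp;              // ToHtml extension method
using AngleSharp.Html;         // PrettyMarkupFormatter
using AngleSharp.Html.Parser;  // HtmlParser

// Hypothetical helper, not part of AngleSharp itself.
public static class HtmlIndenter
{
    // Parses an HTML string and returns it re-serialized with indentation.
    public static string Indent(string html)
    {
        var parser = new HtmlParser();
        var document = parser.ParseDocument(html);

        using (var writer = new StringWriter())
        {
            // PrettyMarkupFormatter inserts line breaks and indentation.
            document.ToHtml(writer, new PrettyMarkupFormatter
            {
                Indentation = "  ",
                NewLine = "\n"
            });
            return writer.ToString();
        }
    }
}
```

To combine this with the question's code, one option is to pass `table.OuterHtml` to `HtmlIndenter.Indent`. Note that `ParseDocument` builds a full document, so a bare `<table>` fragment comes back wrapped in `html`, `head`, and `body` elements.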
No, and it's a "by design" choice. There is a big difference between XML (or XHTML, which is XML, not HTML), where whitespace most of the time has no specific meaning, and HTML.
This is not a minor improvement, as changing whitespace can change the way some browsers render a given HTML chunk, especially malformed HTML (which the library generally handles well). The Html Agility Pack was designed to preserve the way the HTML is rendered, not to minimize the way the markup is written.
I'm not saying it's not feasible or plain impossible. Obviously you can convert to XML and voilà (and you could write an extension method to make this easier), but in the general case the rendered output may be different.
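The extension method mentioned above could be sketched as follows. It wraps the same `XmlWriterSettings` the question already uses; the class name `HtmlNodeExtensions` and the method name `ToIndentedXml` are hypothetical, and the result is XML-style markup, so the caveat about rendering differences still applies.

```csharp
using System.Text;
using System.Xml;
using HtmlAgilityPack;

// Hypothetical extension class, not part of HtmlAgilityPack.
public static class HtmlNodeExtensions
{
    // Serializes the node through an indenting XmlWriter and returns the text.
    public static string ToIndentedXml(this HtmlNode node)
    {
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            Indent = true,
            // Fragment allows output without a single document root.
            ConformanceLevel = ConformanceLevel.Fragment
        };

        var sb = new StringBuilder();
        using (XmlWriter xw = XmlWriter.Create(sb, settings))
        {
            node.WriteTo(xw);
        }
        // The writer is flushed on dispose, so read the buffer afterwards.
        return sb.ToString();
    }
}
```

With this in place, the question's example reduces to `File.WriteAllText("table.xml", table.ToIndentedXml());`.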
As far as I know, HtmlAgilityPack cannot do this. But you could look at the HTML Tidy packages suggested in similar questions:
neat
Is there any option in HTML agility pack to make HTML webpage tidy?
I had the same experience: even though HtmlAgilityPack is great for reading and modifying HTML (or in my case ASP) files, you cannot create readable output.
However, I ended up writing some lines of code that work for me:
Having an HtmlDocument named "m_htmlDocument", I create my HTML file as follows:
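The answer's own code is not preserved in this copy. As an illustration only, and not the author's code, one way to write indented output by hand is to walk the node tree recursively. The sketch below assumes the HtmlAgilityPack members `NodeType`, `ChildNodes`, `Attributes`, `Name`, and `OuterHtml`; the class `IndentedHtmlWriter` is a hypothetical name. It does not escape quotes inside attribute values, and the added whitespace can change rendering for whitespace-sensitive content such as `pre` or inline elements.

```csharp
using System.IO;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class IndentedHtmlWriter
{
    // Writes the node and its descendants, indenting two spaces per level.
    public static void Write(HtmlNode node, TextWriter writer, int depth = 0)
    {
        string pad = new string(' ', depth * 2);

        switch (node.NodeType)
        {
            case HtmlNodeType.Document:
                // The document node has no markup of its own.
                foreach (HtmlNode child in node.ChildNodes)
                    Write(child, writer, depth);
                break;

            case HtmlNodeType.Element:
                bool hasElementChildren =
                    node.ChildNodes.Any(c => c.NodeType == HtmlNodeType.Element);
                if (!hasElementChildren)
                {
                    // Leaf element: emit it as the library serializes it.
                    writer.WriteLine(pad + node.OuterHtml.Trim());
                    break;
                }
                // Assumption: attribute values contain no double quotes.
                string attrs = string.Concat(
                    node.Attributes.Select(a => $" {a.Name}=\"{a.Value}\""));
                writer.WriteLine($"{pad}<{node.Name}{attrs}>");
                foreach (HtmlNode child in node.ChildNodes)
                    Write(child, writer, depth + 1);
                writer.WriteLine($"{pad}</{node.Name}>");
                break;

            default:
                // Text and comment nodes: skip whitespace-only content.
                string text = node.OuterHtml.Trim();
                if (text.Length > 0)
                    writer.WriteLine(pad + text);
                break;
        }
    }
}
```

A call such as `IndentedHtmlWriter.Write(m_htmlDocument.DocumentNode, streamWriter)` would then write the whole document, where `streamWriter` is any open `TextWriter`.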