HTML Agility Pack RemoveChild - not behaving as expected

Posted on 2024-12-12 06:47:52

Say I want to remove the span tag from this HTML, keeping its contents:

<html><span>we do like <b>bold</b> stuff</span></html>

I expected this chunk of code to do what I'm after:

string html = "<html><span>we do like <b>bold</b> stuff</span></html>";
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);

HtmlNode span = doc.DocumentNode.Descendants("span").First();
span.ParentNode.RemoveChild(span, true); //second parameter is 'keepGrandChildren'

But the output looks like this:

<html> stuff<b>bold</b>we do like </html>

It appears to be reversing the child nodes within the span. Am I doing something wrong?

Comments (3)

迷乱花海 2024-12-19 06:47:53
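// `tag` is the element to unwrap (the span in the question).
// Move each child in front of the tag, in document order.
foreach (HtmlNode child in tag.ChildNodes)
{
    tag.ParentNode.InsertBefore(child, tag);
}

// Then remove the now-redundant wrapper from its parent.
tag.Remove();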
谁把谁当真 2024-12-19 06:47:53

Just for the record, this is my version, based on the answers to this question:

using HtmlAgilityPack;

internal static class HtmlAgilityPackExtensions
{
    public static void RemoveNodeKeepChildren(this HtmlNode node)
    {
        foreach (var child in node.ChildNodes)
        {
            node.ParentNode.InsertBefore(child, node);
        }
        node.Remove();
    }
}
窝囊感情。 2024-12-19 06:47:52

Looks like a bug in HtmlAgilityPack; see their issue tracker:

http://htmlagilitypack.codeplex.com/workitem/9113

Interestingly, this was raised 4 years ago...

Here's a snippet that removes all span tags (or any other tag you specify) and keeps the other nodes in the correct order.

void Main()
{
    string html = "<html><span>we do like <b>bold</b> stuff</span></html>";
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);
    RemoveTags(doc, "span");
    Console.WriteLine(doc.DocumentNode.OuterHtml);
}

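// Removes every <tagName> element but keeps its children in their original order.
public static void RemoveTags(HtmlDocument html, string tagName)
{
    var tags = html.DocumentNode.SelectNodes("//" + tagName);
    if (tags != null) // SelectNodes returns null when nothing matches
    {
        foreach (var tag in tags)
        {
            if (!tag.HasChildNodes)
            {
                tag.ParentNode.RemoveChild(tag);
                continue;
            }

            // Walk the children last-to-first: each InsertAfter places the child
            // directly after the tag, so reverse iteration restores document order.
            for (var i = tag.ChildNodes.Count - 1; i >= 0; i--)
            {
                var child = tag.ChildNodes[i];
                tag.ParentNode.InsertAfter(child, tag);
            }
            tag.ParentNode.RemoveChild(tag);
        }
    }
}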
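
As a side note on why the loop direction matters: the sketch below is not from the answers above and makes no claim about HtmlAgilityPack's internals. It models a parent's child list as a plain List<string> (the class and method names are hypothetical) and shows that inserting children directly after the tag in forward order yields the same reversed sequence reported in the question, while iterating backwards, or inserting before the tag in forward order, keeps the original order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: strings stand in for nodes, a List<string> for a child list.
public static class UnwrapOrderDemo
{
    // Forward loop + insert-after: every child lands right behind the tag,
    // pushing previously inserted children to the right, so the order flips.
    public static List<string> InsertAfterForward(List<string> parent, string tag, IReadOnlyList<string> children)
    {
        var result = new List<string>(parent);
        int tagIndex = result.IndexOf(tag);
        for (int i = 0; i < children.Count; i++)
        {
            result.Insert(tagIndex + 1, children[i]);
        }
        result.RemoveAt(tagIndex); // drop the wrapper itself
        return result;
    }

    // Backward loop + insert-after: the last child is placed first and ends up
    // furthest right, so the original order is preserved.
    public static List<string> InsertAfterBackward(List<string> parent, string tag, IReadOnlyList<string> children)
    {
        var result = new List<string>(parent);
        int tagIndex = result.IndexOf(tag);
        for (int i = children.Count - 1; i >= 0; i--)
        {
            result.Insert(tagIndex + 1, children[i]);
        }
        result.RemoveAt(tagIndex);
        return result;
    }

    // Forward loop + insert-before: each child goes in just ahead of the tag,
    // which shifts right by one every time, so the order is preserved as well.
    public static List<string> InsertBeforeForward(List<string> parent, string tag, IReadOnlyList<string> children)
    {
        var result = new List<string>(parent);
        int tagIndex = result.IndexOf(tag);
        foreach (var child in children)
        {
            result.Insert(tagIndex, child);
            tagIndex++;
        }
        result.RemoveAt(tagIndex);
        return result;
    }

    public static void Main()
    {
        var parent = new List<string> { "span" };
        var children = new[] { "we do like ", "<b>bold</b>", " stuff" };

        // Prints: " stuff<b>bold</b>we do like " (reversed, as in the question)
        Console.WriteLine(string.Concat(InsertAfterForward(parent, "span", children)));
        // Prints: "we do like <b>bold</b> stuff"
        Console.WriteLine(string.Concat(InsertAfterBackward(parent, "span", children)));
        // Prints: "we do like <b>bold</b> stuff"
        Console.WriteLine(string.Concat(InsertBeforeForward(parent, "span", children)));
    }
}
```

In this model, the backward InsertAfter loop corresponds to the RemoveTags snippet above, and the forward InsertBefore loop corresponds to the other two answers.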