HTML Agility Pack 空引用

发布于 2024-11-03 07:32:57 字数 659 浏览 6 评论 0原文

我在使用 HTML Agility Pack 时遇到了一些问题。

当我在不包含特定节点的 HTML 上使用此方法时,出现空引用异常。一开始它起作用了,但后来就不起作用了。这只是一个片段,还有大约 10 个 foreach 循环来选择不同的节点。

我做错了什么?

public string Export(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // exception gets thrown on below line
    foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }

    var sw = new StringWriter();
    doc.Save(sw);
    sw.Flush();

    return sw.ToString();
}

I've got some trouble with the HTML Agility Pack.

I get a null reference exception when I use this method on HTML not containing the specific node. It worked at first, but then it stopped working. This is only a snippet and there are about 10 more foreach loops that selects different nodes.

What am I doing wrong?

public string Export(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html);
    // exception gets thrown on below line
    foreach (var repeater in doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']"))
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }

    var sw = new StringWriter();
    doc.Save(sw);
    sw.Flush();

    return sw.ToString();
}

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

网白 2024-11-10 07:32:57

AFAIK,如果没有找到节点,DocumentNode.SelectNodes 可能会返回 null

这是默认行为,请参阅 codeplex 上的讨论线程: Why DocumentNode.SelectNodes returns null

所以解决方法可能是重写 foreach 块:

var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
if (repeaters != null)
{
    foreach (var repeater in repeaters)
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }
}

AFAIK, DocumentNode.SelectNodes could return null if no nodes found.

This is default behaviour, see a discussion thread on codeplex: Why DocumentNode.SelectNodes returns null

So the workaround could be in rewriting the foreach block:

var repeaters = doc.DocumentNode.SelectNodes("//table[@class='mceRepeater']");
if (repeaters != null)
{
    foreach (var repeater in repeaters)
    {
        if (repeater != null)
        {
            repeater.Name = "editor:repeater";
            repeater.Attributes.RemoveAll();
        }
    }
}
深海蓝天 2024-11-10 07:32:57

此内容已更新,您现在可以通过设置 doc.OptionEmptyCollection = true 来防止 SelectNodes 返回 null,如 此 github 问题

如果没有与查询匹配的节点,这将使它返回一个空集合而不是 null(不过,我不确定为什么这不是默认行为)

This has been updated, and you can now prevent SelectNodes from returning null by setting doc.OptionEmptyCollection = true, as detailed in this github issue.

This will make it return an empty collection instead of null if there are no nodes which match the query (I'm not sure why this wasn't the default behaviour to begin with, though)

魔法少女 2024-11-10 07:32:57

根据亚历克斯的回答,但我是这样解决的:

public static class HtmlAgilityPackExtensions
{
    public static HtmlAgilityPack.HtmlNodeCollection SafeSelectNodes(this HtmlAgilityPack.HtmlNode node, string selector)
    {
        return (node.SelectNodes(selector) ?? new HtmlAgilityPack.HtmlNodeCollection(node));
    }
}

As per Alex's answer, but I solved it like this:

public static class HtmlAgilityPackExtensions
{
    public static HtmlAgilityPack.HtmlNodeCollection SafeSelectNodes(this HtmlAgilityPack.HtmlNode node, string selector)
    {
        return (node.SelectNodes(selector) ?? new HtmlAgilityPack.HtmlNodeCollection(node));
    }
}
尴尬癌患者 2024-11-10 07:32:57

在每个 . 之前添加简单的 ? 示例如下:

var titleTag = htdoc?.DocumentNode?.Descendants("title")?.FirstOrDefault()?.InnerText;

You add simple ? before every . example are given blow:

var titleTag = htdoc?.DocumentNode?.Descendants("title")?.FirstOrDefault()?.InnerText;
执手闯天涯 2024-11-10 07:32:57

我创建了通用扩展,它可以与任何 IEnumerable 一起使用

public static List<TSource> ToListOrEmpty<TSource>(this IEnumerable<TSource> source)
{
    return source == null ? new List<TSource>() : source.ToList();
}

,用法是:

var opnodes = bodyNode.Descendants("o:p").ToListOrEmpty();
opnodes.ForEach(x => x.Remove());

I've created universal extension which would work with any IEnumerable<T>

public static List<TSource> ToListOrEmpty<TSource>(this IEnumerable<TSource> source)
{
    return source == null ? new List<TSource>() : source.ToList();
}

And usage is:

var opnodes = bodyNode.Descendants("o:p").ToListOrEmpty();
opnodes.ForEach(x => x.Remove());
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文