Method hangs on for loop and won't continue

Posted on 2024-12-17 10:05:46

I'm using Html Agility Pack to parse several text files that I load. I then save the parsed data into a list of strings for further processing. However, when I use this method, it never reaches the line:

MessageBox.Show("test");

Additionally, if I include any other code after this method, none of it runs.

Does anyone have any suggestions as to my error?

The entire method is included below:

private void ParseOutput()
{
    nodeDupList = new List<string>();
    StreamWriter OurStream;
    OurStream = File.CreateText(dir + @"\CombinedPages.txt");
    OurStream.Close();
    for (int crawl = 1; crawl <= crawlPages.Length; crawl++)
    {
        var web = new HtmlWeb();
        var doc = web.Load(dir + @"\Pages" + crawl.ToString() + ".txt");

        var nodeCount = doc.DocumentNode.SelectNodes(@"/html[1]/body[1]/div[1]/table[3]/tbody[1]/tr[td/@class=""style_23""]");
        int nCount = nodeCount.Count;
        for (int a = 3; a <= nCount; a++)
        {
            var specContent = doc.DocumentNode.SelectNodes(@"/html[1]/body[1]/div[1]/table[3]/tbody[1]/tr[" + a + @"]/td[3]/div[contains(@class,'style_24')]"); 
            foreach (HtmlNode node in specContent)
            {
                nodeDupList.Add(node.InnerText + ".d");
            }
        }
    }
    MessageBox.Show("test");
}

I've created a crawler that saves multiple HTML pages as text files, and I parse each one separately using this method.
I'm only using MessageBox to show that execution doesn't continue past the for loop. My solution calls several other methods, and execution never gets to them.

The application is a WinForms application targeting .NET Framework 4.

Edit:
Thanks for the help.

After rerunning it in the debugger, I realized that it sometimes crashed in this loop

    for (int a = 3; a <= nCount; a++) 
    { 
        var specContent = doc.DocumentNode.SelectNodes(@"/html[1]/body[1]/div[1]/table[3]/tbody[1]/tr[" + a + @"]/td[3]/div[contains(@class,'style_24')]");  
        foreach (HtmlNode node in specContent) 
        { 
            nodeDupList.Add(node.InnerText + ".d"); 
        } 
    } 

when specContent was null.

There was no exception generated; the method just ended.

Because the website I was crawling is dynamic, SelectNodes rarely returned null, but on several occasions it did, and that is when this happened.
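
For anyone wondering why a null result ends the loop: iterating a null collection with foreach throws a NullReferenceException in C#. The sketch below shows that with a plain List<string> standing in for the SelectNodes result; the Iterate helper and class name are made up for illustration and are not part of Html Agility Pack. If no exception surfaced in the app, it was most likely thrown and then swallowed somewhere up the call chain, though where that happened is not something the post shows.

```csharp
using System;
using System.Collections.Generic;

class NullForeachDemo
{
    // Hypothetical helper: walks the list the same way the question's foreach does.
    // The List<string> is only a stand-in for what SelectNodes returned.
    public static string Iterate(List<string> specContent)
    {
        try
        {
            int n = 0;
            foreach (string node in specContent)
            {
                n++;
            }
            return "iterated " + n + " item(s)";
        }
        catch (NullReferenceException)
        {
            // foreach calls GetEnumerator() on the null reference, which throws.
            return "NullReferenceException";
        }
    }

    static void Main()
    {
        Console.WriteLine(Iterate(new List<string> { "a", "b" }));
        Console.WriteLine(Iterate(null));
    }
}
```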


Comments (1)

从﹋此江山别 2024-12-24 10:05:46

The solution, for anyone who might need this, is to check whether specContent is null before iterating over it:

for (int a = 3; a <= nCount; a++)  
{  
    var specContent = doc.DocumentNode.SelectNodes(@"/html[1]/body[1]/div[1]/table[3]/tbody[1]/tr[" + a + @"]/td[3]/div[contains(@class,'style_24')]");
    if(specContent !=null) //added this check for null
    {
        foreach (HtmlNode node in specContent)  
        {  
            nodeDupList.Add(node.InnerText + ".d");  
        }
    }  
}

I also could have used a try{} catch{} block to output the error if needed.
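
A rough sketch of that try/catch variant, in case it helps. SelectRow is a hypothetical stand-in for the doc.DocumentNode.SelectNodes(...) call (it returns null for row 4 to imitate a page where nothing matched), and the errors list is also just an assumption about how you might report the problem. Each row gets its own try block, so one bad row is recorded and the loop moves on to the next.

```csharp
using System;
using System.Collections.Generic;

class TryCatchSketch
{
    // Hypothetical stand-in for doc.DocumentNode.SelectNodes(...):
    // returns null for row 4 to imitate a row with no matching nodes.
    public static List<string> SelectRow(int row)
    {
        if (row == 4) return null;
        return new List<string> { "cell" + row };
    }

    // Collects the text of every row from 3 to nCount and records rows that failed.
    public static void Parse(int nCount, List<string> nodeDupList, List<string> errors)
    {
        for (int a = 3; a <= nCount; a++)
        {
            try
            {
                foreach (string text in SelectRow(a))
                {
                    nodeDupList.Add(text + ".d");
                }
            }
            catch (NullReferenceException ex)
            {
                // Record the failure instead of letting it end the whole method.
                errors.Add("row " + a + ": " + ex.Message);
            }
        }
    }

    static void Main()
    {
        var nodeDupList = new List<string>();
        var errors = new List<string>();
        Parse(5, nodeDupList, errors);
        Console.WriteLine(nodeDupList.Count + " items, " + errors.Count + " error(s)");
    }
}
```

With the stand-in above this reports 2 items and 1 error. The explicit null check is usually the cleaner choice, since a missing match is an expected case here; the try/catch is mainly useful when you want the failure logged.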
