Html Agility Pack help

Posted on 2024-09-27 15:49:43

I'm trying to scrape some information from a website but can't find a solution that works for me. Every code sample I find on the Internet generates at least one error for me.

Even the example code on their homepage generates errors for me.

My code:

         HtmlDocument doc = new HtmlDocument();
         doc.Load("https://www.flashback.org/u479804");
         foreach(HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href"])
         {
            HtmlAttribute att = link["href"];
            att.Value = FixLink(att);
         }
         doc.Save("file.htm");

Generates the following error:

'HtmlDocument' is an ambiguous reference between 'System.Windows.Forms.HtmlDocument' and 'HtmlAgilityPack.HtmlDocument' C:*\Form1.cs

Edit: My entire code is located here: http://beta.yapaste.com/55

Any help is much appreciated!

Comments (4)

腹黑女流氓 2024-10-04 15:49:43

Use HtmlAgilityPack.HtmlDocument:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

The compiler is getting confused because two of the namespaces you have imported with using contain classes called HtmlDocument - the HTML Agility Pack namespace, and the Windows Forms namespace. You can get around this by specifying which class you want to use explicitly.

落叶缤纷 2024-10-04 15:49:43

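This is how I got it working. Note that the main Html Agility Pack example has an error in its foreach line, doc.DocumentElement.SelectNodes("//a[@href"]): the closing bracket of the XPath predicate sits outside the string literal. The corrected and tested version is given below; it uses doc.DocumentNode and the XPath "//a[@href]".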

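    // Needs: using System.Collections.Generic; using System.Text; using HtmlAgilityPack;
    HtmlWeb hw = new HtmlWeb();
    HtmlDocument doc = hw.Load(@"http://adityabajaj.com");
    StringBuilder sb = new StringBuilder();
    List<string> lstHref = new List<string>();

    // SelectNodes can return null when nothing matches, so guard before iterating.
    HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
    if (links != null)
    {
        foreach (HtmlNode link in links)
        {
            string curHref = link.Attributes["href"].Value;

            // Keep each href only once.
            if (!lstHref.Contains(curHref))
                lstHref.Add(curHref);
        }
    }

    foreach (string str in lstHref)
    {
        sb.Append(str + "<br />");
    }

    // Response is the HttpResponse of the ASP.NET page this code runs in.
    Response.Write(sb.ToString());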

Since it worked for me, I thought I should share it.

携君以终年 2024-10-04 15:49:43

The classes in the two namespaces System.Windows.Forms and HtmlAgilityPack are conflicting. Use fully-qualified type names or use namespace aliases.
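As a minimal sketch of the alias option, assuming both System.Windows.Forms and HtmlAgilityPack are referenced by the project: a using alias for the type takes precedence over the names brought in by the two namespace imports, so the bare name HtmlDocument stops being ambiguous in that file. The CountLinks method and the inline HTML string are made up for illustration and are not from the question.

```csharp
using System.Windows.Forms;
using HtmlAgilityPack;
// Type alias: in this file, the bare name HtmlDocument means the Agility Pack class.
using HtmlDocument = HtmlAgilityPack.HtmlDocument;

public class Form1 : Form
{
    // Hypothetical helper: counts the <a> elements that carry an href attribute.
    public static int CountLinks(string html)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes can return null when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        return links == null ? 0 : links.Count;
    }
}
```

If the Windows Forms class is needed in the same file, it can still be written out as System.Windows.Forms.HtmlDocument.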

べ繥欢鉨o。 2024-10-04 15:49:43

I have written a couple of articles that explain how to use HtmlAgilityPack. You might find them useful to get started:

WARNING (2012-06-08): This link is a bit spammy - dodgy pop-under adverts, not much content.

I don't know whether they have fixed it since, but that snippet on the site's homepage did not work when I tried it; I think it was written for an earlier version of the library. The snippet also doesn't define FixLink(), so it wouldn't work even if the rest were correct for the library.
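What FixLink() was meant to do is not stated anywhere, so the following is only a guess at one plausible stand-in: a helper that resolves an href against the URL of the page it came from. The class name LinkFixer, the two-string signature, and the "make links absolute" behaviour are all assumptions made for illustration.

```csharp
using System;

public static class LinkFixer
{
    // Hypothetical stand-in for the undefined FixLink():
    // resolves an href against the URL of the page that contains it.
    public static string FixLink(string pageUrl, string href)
    {
        Uri baseUri = new Uri(pageUrl);
        Uri resolved;

        // Leave the value untouched if it cannot be parsed as a URI reference.
        return Uri.TryCreate(baseUri, href, out resolved)
            ? resolved.AbsoluteUri
            : href;
    }
}
```

Inside the question's loop this could be called as att.Value = LinkFixer.FixLink(pageUrl, att.Value);, where pageUrl is whatever address the document was downloaded from.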

I would recommend getting the latest beta version of the library, because it adds extensions for running LINQ queries against the document, which can save you from confusing XPath queries later on.
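A rough sketch of what such a LINQ query can look like, assuming a version of the library that exposes Descendants() and GetAttributeValue() on HtmlNode. The LinkLister class and the sample markup are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkLister
{
    // Returns the distinct, non-empty href values of all <a> elements.
    public static List<string> GetHrefs(string html)
    {
        // Fully qualified so the name stays unambiguous in a Windows Forms project.
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("a")
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => href.Length > 0)   // skips anchors without an href
            .Distinct()
            .ToList();
    }

    public static void Main()
    {
        string html = "<p><a href='/a'>A</a> <a href='/b'>B</a> "
                    + "<a href='/a'>A again</a> <a>no href</a></p>";

        foreach (string href in GetHrefs(html))
            Console.WriteLine(href);
    }
}
```

Unlike SelectNodes, Descendants() yields an empty sequence rather than null when there are no matches, so no null check is needed before the query.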

I haven't seen it used in a Windows Forms app before but it looks like you will have to use fully-qualified type names like:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();

As for the actual task you are trying to perform, it seems you want to take a URL, inject a username and ID into it, and then... I'm not sure. It looks like you are trying both to save the file to disk and to set the HTML as the contents of a Form, which I don't think you can do.
