Getting text from HTML on WP7 using HtmlAgilityPack

Posted 2024-12-20 19:31:18

I'm trying to extract text from HTML using HtmlAgilityPack. I successfully added HtmlAgilityPack to my project. However, I tried the following code to extract the body text:

HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();

// There are various options, set as needed
htmlDoc.OptionFixNestedTags=true;

// filePath is a path to a file containing the html
htmlDoc.Load(filePath);

// Use:  htmlDoc.LoadHtml(htmlString);  to load from a string

// ParseErrors is an ArrayList containing any errors from the Load statement
if (htmlDoc.ParseErrors!=null && htmlDoc.ParseErrors.Count>0)
{
    // Handle any parse errors as required
}
else
{
    if (htmlDoc.DocumentNode != null)
    {
        HtmlNode bodyNode = htmlDoc.DocumentNode.SelectSingleNode("//body");

        if (bodyNode != null)
        {
            // Do something with bodyNode
        }
    }
}

and I receive the following error when building the project.

Error 1 The type 'System.Xml.XPath.IXPathNavigable' is defined in an assembly that is not referenced. You must add a reference to assembly 'System.Xml.XPath, Version=2.0.5.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'. D:\test\test\MainPage.xaml.cs 58

I should add that I have already added a reference to System.Xml, and I still get this error. Can you please help me figure out what the issue is? Thanks.

Comments (3)

メ斷腸人バ 2024-12-27 19:31:18

Thanks. I figured out that I had to add a reference to the System.Xml.XPath from the Silverlight 4.0 folder available in the Microsoft SDKs parent folder.
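
Once that reference resolves, the XPath call from the question should compile. As a minimal sketch of the "extract the body text" step, assuming the HTML is already available as a string (the class name `BodyTextExample` and the method `GetBodyText` are made up for illustration):

```csharp
using System;
using HtmlAgilityPack;

public static class BodyTextExample
{
    // Hypothetical helper: returns the text inside <body>,
    // or an empty string when the document has no <body>.
    public static string GetBodyText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode is the XPath-based call that requires
        // the System.Xml.XPath reference mentioned above.
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
        if (body == null)
        {
            return string.Empty;
        }

        // InnerText concatenates every text node under <body>;
        // DeEntitize converts entities such as &amp; back to characters.
        return HtmlEntity.DeEntitize(body.InnerText).Trim();
    }
}
```

Note that `InnerText` also includes the contents of `<script>` and `<style>` elements, so the result may need further filtering.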

感性不性感 2024-12-27 19:31:18

With HAP on the phone you'll have to use LINQ (the Linq2Xml-style Descendants API) rather than XPath to find things in the parsed HTML. You might also have to build the phone version from source (HAPPhone).

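// Needs: using System.Linq; using System.Collections.Generic; using HtmlAgilityPack;
public void Hap()
{
    // OnCallback is invoked once the page has been loaded
    HtmlWeb.LoadAsync("http://www.page.com", OnCallback);
}

private void OnCallback(object s, HtmlDocumentLoadCompleted htmlDocumentLoadCompleted)
{
    var htmlDocument = htmlDocumentLoadCompleted.Document;

    // All <select> elements in the document
    var test = htmlDocument.DocumentNode.Descendants("select").ToList();

    // The <select> whose id is "stateDropdown"; GetAttributeValue avoids a
    // NullReferenceException on elements that have no id attribute
    var dropdown = (from h in htmlDocument.DocumentNode.Descendants("select")
                    where h.GetAttributeValue("id", "") == "stateDropdown"
                    select h).FirstOrDefault();

    // Its child nodes, or an empty list when no such element exists
    var test2 = dropdown != null ? dropdown.ChildNodes.ToList() : new List<HtmlNode>();
}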
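
The same LINQ approach can be applied to the original goal of getting the body text without any XPath call. This is only a sketch, assuming an already parsed `HtmlDocument`; the class name `LinqBodyTextExample` and the method `GetBodyText` are hypothetical:

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class LinqBodyTextExample
{
    // Hypothetical helper: collects the text under <body> using
    // Descendants() only, so System.Xml.XPath is not involved.
    public static string GetBodyText(HtmlDocument doc)
    {
        HtmlNode body = doc.DocumentNode.Descendants("body").FirstOrDefault();
        if (body == null)
        {
            return string.Empty;
        }

        var parts = body.Descendants()
            // Keep text nodes only
            .Where(n => n.NodeType == HtmlNodeType.Text)
            // Skip text that belongs to <script> or <style> elements
            .Where(n => n.ParentNode.Name != "script" && n.ParentNode.Name != "style")
            .Select(n => HtmlEntity.DeEntitize(n.InnerText).Trim())
            // Drop whitespace-only fragments
            .Where(t => t.Length > 0);

        return string.Join(" ", parts.ToArray());
    }
}
```

Inside the callback it could be called as `var text = LinqBodyTextExample.GetBodyText(htmlDocumentLoadCompleted.Document);`.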

差↓一点笑了 2024-12-27 19:31:18

It says you need to add a reference to System.Xml.XPath not System.Xml.
