HTML parsing returns no results

Posted on 2024-12-02 16:52:44

I am trying to parse this HTML document to get the contents of the flight, time, origin and date cells and output them.

<div id="FlightInfo_FlightInfoUpdatePanel">

<table cellspacing="0" cellpadding="0">
<tbody>
    <tr class="">
    <td class="airline"><img src="/images/airline logos/US.gif" title="US AIRWAYS. " alt="US AIRWAYS. " /></td>
    <td class="flight">US5316</td>
    <td class="codeshare">NZ46</td>
    <td class="origin">Rarotonga</td>
    <td class="date">02 Sep</td>
    <td class="time">10:30</td>
    <td class="est">21:30</td>
    <td class="status">CHECK IN CLOSING</td>
    </tr>

I am using this code, based on HTML Agility Pack for Windows Phone 7, to find and output the content of <td class="flight">US5316</td>:

void client_DownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
{
    var html = e.Result;

    var doc = new HtmlDocument();
    doc.LoadHtml(html);


    var node = doc.DocumentNode.Descendants("div")
        .FirstOrDefault(x => x.Id == "FlightInfo_FlightInfoUpdatePanel")
        .Element("table")
        .Element("tbody")
        .Elements("tr")
        .Where(tr => tr.GetAttributeValue("td", "").Contains("class"))
        .SelectMany(tr => tr.Descendants("flight"))
        .ToArray();

    this.scrollViewer1.Content = node;  

   //Added below

   listBox1.itemSource = node;
}

I get no results in either the ScrollViewer or the ListBox. Is the LINQ query I am using correct for the HTML I supplied?

审判长 2024-12-09 16:52:44

What do you intend to do with this line?

.Where(tr => tr.GetAttributeValue("td", "").Contains("class"))

GetAttributeValue(name, def) looks for an attribute with the key name in the node and returns the value of that attribute if it finds one. Otherwise, it returns the default value def.

So what's actually happening here is that <tr> doesn't have any attribute with the key td, so it's returning the default value (an empty string), which does not contain the substring "class", so your <tr> node is being filtered out.
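If the intent was to pick out the cells whose class is flight, the attribute to read is class, and it lives on the <td> nodes rather than on the <tr>. A minimal sketch of that filter, assuming markup like the snippet in the question; FlightCells and GetCellTexts are hypothetical names used only for illustration:

```csharp
using System.Linq;
using HtmlAgilityPack;

public static class FlightCells
{
    // Returns the trimmed inner text of every <td> whose class attribute equals cssClass.
    public static string[] GetCellTexts(string html, string cssClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("td")
            // Read the "class" attribute of the <td>; "" is the default when it is absent.
            .Where(td => td.GetAttributeValue("class", "") == cssClass)
            .Select(td => td.InnerText.Trim())
            .ToArray();
    }
}
```

With the row shown in the question, FlightCells.GetCellTexts(e.Result, "flight") should contain "US5316".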

Edit:
This will return an array where each entry is an array of 8 strings containing the contents of each td:

var node = doc.DocumentNode.Descendants("div")
    .FirstOrDefault(x => x.Id == "FlightInfo_FlightInfoUpdatePanel")
    .Element("table")
    .Element("tbody")
    .Elements("tr")
    .Select(tr => tr.Elements("td").Select(td => td.InnerText).ToArray())
    .ToArray();

Examples:

node[0][1] == "US5316"
node[0][3] == "Rarotonga"
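One caveat with the chain above: FirstOrDefault returns null when no div carries that id, and Element("tbody") returns null when the downloaded markup has no <tbody>, so either case ends in a NullReferenceException. The following is a hedged sketch of a more defensive variant that also keys each cell by its class name instead of its position. FlightTable and ParseRows are hypothetical names, and using Descendants("tr") rather than the explicit table/tbody path is a substitution made here, not part of the answer above:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class FlightTable
{
    // Maps each <tr> inside the panel to a dictionary: td class name -> td text.
    public static List<Dictionary<string, string>> ParseRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var panel = doc.DocumentNode.Descendants("div")
            .FirstOrDefault(x => x.Id == "FlightInfo_FlightInfoUpdatePanel");

        // The div may be missing; return an empty list instead of throwing.
        if (panel == null)
            return new List<Dictionary<string, string>>();

        return panel.Descendants("tr")
            .Select(tr => tr.Elements("td")
                // Skip cells without a class, then group so a repeated class cannot cause a duplicate-key error.
                .Where(td => td.GetAttributeValue("class", "") != "")
                .GroupBy(td => td.GetAttributeValue("class", ""))
                .ToDictionary(g => g.Key, g => g.First().InnerText.Trim()))
            // Drop rows that produced no classed cells (for example, header rows).
            .Where(row => row.Count > 0)
            .ToList();
    }
}
```

For the row in the question, ParseRows(e.Result)[0]["flight"] should be "US5316" and ["origin"] should be "Rarotonga".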

岁月无声 2024-12-09 16:52:44

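You are trying to set the ScrollViewer's content to a string[] (an array). So I will repeat myself and say that you should spend some time learning basic C# before continuing with this effort.

What you need to do is use a ListBox instead of a ScrollViewer, and then set ListBox.ItemsSource to your node string array.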
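A ListBox without an item template typically displays each item by calling ToString() on it, so binding an array of string arrays would likely show the type name rather than the cell text. A minimal sketch of flattening each row into one display string first, assuming node is the array of string arrays from the answer above; FlightDisplay and ToLines are hypothetical names:

```csharp
using System.Linq;

public static class FlightDisplay
{
    // Joins the non-empty cells of each row into one line of text per row.
    public static string[] ToLines(string[][] rows)
    {
        return rows
            .Select(cells => string.Join("  ",
                cells.Where(c => !string.IsNullOrEmpty(c)).ToArray()))
            .ToArray();
    }
}

// Inside the download-completed handler (listBox1 is the ListBox from the question):
// listBox1.ItemsSource = FlightDisplay.ToLines(node);
```

The empty-cell filter is there because the airline cell holds only an image, so its inner text is empty.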

About the author

紧拥背影
