HTML parsing returns no results

Posted on 2024-12-02 16:52:44

I am trying to parse this HTML document to get the contents of the flight, time, origin and date cells and output them.

<div id="FlightInfo_FlightInfoUpdatePanel">

<table cellspacing="0" cellpadding="0">
<tbody>
    <tr class="">
    <td class="airline"><img src="/images/airline logos/US.gif" title="US AIRWAYS. " alt="US AIRWAYS. " /></td>
    <td class="flight">US5316</td>
    <td class="codeshare">NZ46</td>
    <td class="origin">Rarotonga</td>
    <td class="date">02 Sep</td>
    <td class="time">10:30</td>
    <td class="est">21:30</td>
    <td class="status">CHECK IN CLOSING</td>
    </tr>

I am using this code, based on HTML Agility Pack for Windows Phone 7, to find and output the content of <td class="flight">US5316</td>:

void client_DownloadStringCompleted(object sender, DownloadStringCompletedEventArgs e)
{
    var html = e.Result;

    var doc = new HtmlDocument();
    doc.LoadHtml(html);


    var node = doc.DocumentNode.Descendants("div")
        .FirstOrDefault(x => x.Id == "FlightInfo_FlightInfoUpdatePanel")
        .Element("table")
        .Element("tbody")
        .Elements("tr")
        .Where(tr => tr.GetAttributeValue("td", "").Contains("class"))
        .SelectMany(tr => tr.Descendants("flight"))
        .ToArray();

    this.scrollViewer1.Content = node;  

    // Added below

    listBox1.ItemsSource = node;
}

I get no results in either the ScrollViewer or the ListBox. Is the LINQ query I am using correct for the HTML I supplied?

Comments (2)

审判长 2024-12-09 16:52:44

What do you intend to do with this line?

.Where(tr => tr.GetAttributeValue("td", "").Contains("class"))

GetAttributeValue(name, def) looks for an attribute with the key name on the node and returns the value of that attribute if it finds one. Otherwise, it returns the default value def.

So what's actually happening here is that <tr> doesn't have any attribute with the key td, so it's returning the default value (an empty string), which does not contain the substring "class", so your <tr> node is being filtered out.
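
To make the difference between an attribute and a child element concrete, here is a minimal sketch. It assumes the HtmlAgilityPack package is referenced; the class name AttributeLookupDemo, its members, and the shortened HTML string are made up for illustration and are not part of the original code.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class AttributeLookupDemo
{
    // Shortened copy of the row from the question (two cells instead of eight).
    public const string Html =
        "<table><tbody><tr class=\"\">" +
        "<td class=\"flight\">US5316</td>" +
        "<td class=\"origin\">Rarotonga</td>" +
        "</tr></tbody></table>";

    // Parses the sample HTML and returns its first <tr> node.
    public static HtmlNode FirstRow()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        return doc.DocumentNode.Descendants("tr").First();
    }

    // "td" is the name of a child element, not an attribute of <tr>,
    // so this lookup always falls back to the supplied default.
    public static string RowAttributeNamedTd(string fallback)
    {
        return FirstRow().GetAttributeValue("td", fallback);
    }

    // "class" is an attribute of the first <td>, so its value is returned.
    public static string FirstCellClass()
    {
        return FirstRow().Elements("td").First().GetAttributeValue("class", "");
    }
}
```

With this sample, RowAttributeNamedTd("(none)") should give back "(none)", while FirstCellClass() should give back "flight". That is why the Where clause in the question rejects every row.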

Edit:
This will return an array where each entry is an array of 8 strings containing the contents of each td:

var node = doc.DocumentNode.Descendants("div")
    .FirstOrDefault(x => x.Id == "FlightInfo_FlightInfoUpdatePanel")
    .Element("table")
    .Element("tbody")
    .Elements("tr")
    .Select(tr => tr.Elements("td").Select(td => td.InnerText).ToArray())
    .ToArray();

Examples:

node[0][1] == "US5316"
node[0][3] == "Rarotonga"
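
If only one column is needed, such as the flight number, one possible variation is to filter the cells by their class attribute instead of indexing into the array. This is a sketch under the assumption that HtmlAgilityPack is referenced; FlightCellDemo and CellsByClass are hypothetical names, not library API. Note that Descendants takes an element name, so Descendants("flight") in the question looks for <flight> elements, which do not exist in this markup.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class FlightCellDemo
{
    // Returns the trimmed text of every <td> whose class attribute equals cssClass.
    public static string[] CellsByClass(string html, string cssClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode
            .Descendants("td")                                          // match by element name
            .Where(td => td.GetAttributeValue("class", "") == cssClass) // then by attribute value
            .Select(td => td.InnerText.Trim())
            .ToArray();
    }
}
```

Calling FlightCellDemo.CellsByClass(html, "flight") on the markup from the question should yield one entry per row, each holding that row's flight number.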

岁月无声 2024-12-09 16:52:44

You're trying to set the content of a ScrollViewer to a string[] (an array). So I'll repeat myself, and say that you should take some time to learn basic C# before you continue this endeavour.

What you need to do is use a ListBox instead of the ScrollViewer and then set ListBox.ItemsSource to your node string array.
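
If node has the array-of-arrays shape produced by the query in the first answer, each row has to be turned into a single string before a plain ListBox can display it sensibly. The sketch below shows one way to do that; RowFormatter, ToDisplayLines and the " | " separator are assumptions for illustration, and listBox1 is the control from the question.

```csharp
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class RowFormatter
{
    // Joins the non-empty cells of each row into one display line.
    public static string[] ToDisplayLines(string[][] rows)
    {
        return rows
            .Select(cells => string.Join(
                " | ",
                cells.Where(c => c.Trim().Length > 0).ToArray()))
            .ToArray();
    }
}

// Possible usage inside client_DownloadStringCompleted:
//     listBox1.ItemsSource = RowFormatter.ToDisplayLines(node);
```

Empty cells, such as the airline cell that contains only an image, are skipped so the line does not start with a stray separator.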
