HTML 敏捷包:解析 href 标签

发布于 2024-12-21 07:56:56 字数 1536 浏览 0 评论 0原文

我如何有效地从中解析 href 属性值:

<tr>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">
<a class="undMe" href="/ice/player.htm?id=8475179" rel="skaterLinkData" shape="rect">D. Kulikov</a>
</td>
<td rowspan="1" colspan="1">D</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
[...]

我对玩家 ID 感兴趣,即:8475179 这是我到目前为止的代码:

        // Iterate all rows (players)
        for (int i = 1; i < rows.Count; ++i)
        {
            HtmlNodeCollection cols = rows[i].SelectNodes(".//td");

            // new player
            Dim_Player player = new Dim_Player();

                // Iterate all columns in this row
                for (int j = 1; j < 6; ++j)
                {
                    switch (j) {
                        case 1: player.Name = cols[j].InnerText;
                                player.Player_id = Int32.Parse(/* this is where I want to parse the href value */); 
                                break;
                        case 2: player.Position = cols[j].InnerText; break;
                        case 3: stats.Goals = Int32.Parse(cols[j].InnerText); break;
                        case 4: stats.Assists = Int32.Parse(cols[j].InnerText); break;
                        case 5: stats.Points = Int32.Parse(cols[j].InnerText); break;
                    }
                }

How would I effectively parse the href attribute value from this :

<tr>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">
<a class="undMe" href="/ice/player.htm?id=8475179" rel="skaterLinkData" shape="rect">D. Kulikov</a>
</td>
<td rowspan="1" colspan="1">D</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
[...]

I am interested in having the player id, which is: 8475179 Here is the code I have so far:

        // Iterate all rows (players)
        for (int i = 1; i < rows.Count; ++i)
        {
            HtmlNodeCollection cols = rows[i].SelectNodes(".//td");

            // new player
            Dim_Player player = new Dim_Player();

                // Iterate all columns in this row
                for (int j = 1; j < 6; ++j)
                {
                    switch (j) {
                        case 1: player.Name = cols[j].InnerText;
                                player.Player_id = Int32.Parse(/* this is where I want to parse the href value */); 
                                break;
                        case 2: player.Position = cols[j].InnerText; break;
                        case 3: stats.Goals = Int32.Parse(cols[j].InnerText); break;
                        case 4: stats.Assists = Int32.Parse(cols[j].InnerText); break;
                        case 5: stats.Points = Int32.Parse(cols[j].InnerText); break;
                    }
                }

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

青芜 2024-12-28 07:56:56

根据您的示例,这对我有用:

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.Load("test.html");
var link = htmlDoc.DocumentNode
                  .Descendants("a")
                  .First(x => x.Attributes["class"] != null 
                           && x.Attributes["class"].Value == "undMe");

string hrefValue = link.Attributes["href"].Value;
long playerId = Convert.ToInt64(hrefValue.Split('=')[1]);

对于实际使用,您需要添加错误检查等。

Based on your example this worked for me:

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.Load("test.html");
var link = htmlDoc.DocumentNode
                  .Descendants("a")
                  .First(x => x.Attributes["class"] != null 
                           && x.Attributes["class"].Value == "undMe");

string hrefValue = link.Attributes["href"].Value;
long playerId = Convert.ToInt64(hrefValue.Split('=')[1]);

For real use you need to add error checking etc.

弃爱 2024-12-28 07:56:56

使用 XPath 表达式来查找它:

 foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@class='undMe']"))
 {
      HtmlAttribute att = link.Attributes["href"];
      Console.WriteLine(new Regex(@"(?<=[\?&]id=)\d+(?=\&|\#|$)").Match(att.Value).Value);
 }

Use an XPath expression to find it:

 foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@class='undMe']"))
 {
      HtmlAttribute att = link.Attributes["href"];
      Console.WriteLine(new Regex(@"(?<=[\?&]id=)\d+(?=\&|\#|$)").Match(att.Value).Value);
 }
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文