Parsing an HTML document with HtmlAgilityPack
I'm trying to parse the following html snippet via HtmlAgilityPack:
<td bgcolor="silver" width="50%" valign="top">
<table bgcolor="silver" style="font-size: 90%" border="0" cellpadding="2" cellspacing="0"
width="100%">
<tr bgcolor="#003366">
<td>
<font color="white">Info
</td>
<td>
<font color="white">
<center>Price
</td>
<td align="right">
<font color="white">Hourly
</td>
</tr>
<tr>
<td>
<a href='test1.cgi?type=1'>Bookbags</a>
</td>
<td>
$156.42
</td>
<td align="right">
<font color="green">0.11%</font>
</td>
</tr>
<tr>
<td>
<a href='test2.cgi?type=2'>Jeans</a>
</td>
<td>
$235.92
</td>
<td align="right">
<font color="red">100%</font>
</td>
</tr>
</table>
</td>
My code looks something like this:
private void ParseHtml(HtmlDocument htmlDoc)
{
    var ItemsAndPrices = new Dictionary<string, int>();
    var findItemPrices = from links in htmlDoc.DocumentNode.Descendants()
                         where links.Name.Equals("table") &&
                               links.Attributes["width"].Equals("100%") &&
                               links.Attributes["bgcolor"].Equals("silver")
                         select new
                         {
                             //select item and price
                         };
}
In this instance, I would like to select the items (Jeans and Bookbags) along with their associated prices, and store them in a dictionary.
E.g. Jeans at price $235.92
Does anyone know how to do this properly with HtmlAgilityPack and LINQ?
Comments (3)
Try this:
Regex solution:
Then:
Output:
Note: the value type of the Dictionary can't be int, because $235.92 and $156.42 are not valid int values. To turn a price into a valid int, you can remove the dollar sign and the dot and use …
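The code for this answer did not survive on this page, so the following is only a minimal sketch of what a regex approach and the price conversion from the note could look like, not the answerer's original code. The class name `RegexPriceSketch`, the pattern, and the helper `ToCents` are hypothetical, and the pattern assumes each item link is immediately followed by the cell holding its `$` amount, as in the snippet from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RegexPriceSketch
{
    // Matches the text of a link, then the "$n.nn" amount in the next <td>.
    private static readonly Regex RowPattern = new Regex(
        @"<a\b[^>]*>(?<item>[^<]+)</a>\s*</td>\s*<td[^>]*>\s*(?<price>\$[\d,]+(?:\.\d+)?)",
        RegexOptions.IgnoreCase);

    public static Dictionary<string, decimal> Extract(string html)
    {
        var result = new Dictionary<string, decimal>();
        foreach (Match m in RowPattern.Matches(html))
        {
            // Strip the "$" and parse the rest culture-independently.
            var amount = m.Groups["price"].Value.TrimStart('$');
            result[m.Groups["item"].Value.Trim()] =
                decimal.Parse(amount, NumberStyles.Number, CultureInfo.InvariantCulture);
        }
        return result;
    }

    // Alternative from the note: keep an int by storing cents ("$156.42" -> 15642).
    // Assumes the price always has exactly two decimal places.
    public static int ToCents(string price) =>
        int.Parse(price.Replace("$", "").Replace(".", ""), CultureInfo.InvariantCulture);
}
```

Using `decimal` as the dictionary value keeps the fractional part of the price. Storing cents as an `int` also works, but only while every price is written with two decimals. A regex like this is tied to the exact layout of the markup, so it is more fragile than walking the parsed document.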
Here's what I came up with:
The only thing I changed was your <string, int>, because $156.42 isn't an int.
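The code for this answer is also missing from the page. As an illustration only, here is one way the query from the question could be completed with a `Dictionary<string, string>`. It is a sketch, not the answerer's code: the class name `ItemPriceSketch` is hypothetical, it needs the HtmlAgilityPack package, and it assumes every data row has its link in the first cell and its price in the second.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ItemPriceSketch
{
    public static Dictionary<string, string> ParseHtml(HtmlDocument htmlDoc)
    {
        var query =
            from table in htmlDoc.DocumentNode.Descendants("table")
            // Compare attribute values rather than HtmlAttribute objects; the
            // default value also covers tables where the attribute is missing.
            where table.GetAttributeValue("width", "") == "100%"
               && table.GetAttributeValue("bgcolor", "") == "silver"
            from row in table.Descendants("tr")
            let cells = row.Elements("td").ToList()
            // Data rows hold a link in the first cell; the header row does not.
            let link = cells.Count >= 2 ? cells[0].Element("a") : null
            where link != null
            select new
            {
                Item = link.InnerText.Trim(),
                Price = cells[1].InnerText.Trim()
            };

        var itemsAndPrices = new Dictionary<string, string>();
        foreach (var entry in query)
            itemsAndPrices[entry.Item] = entry.Price; // e.g. "Jeans" -> "$235.92"
        return itemsAndPrices;
    }
}
```

In the question's query, `links.Attributes["width"].Equals("100%")` compares an attribute object with a string, and `Attributes["width"]` is null for a table without that attribute. Reading the value through `GetAttributeValue` with a default sidesteps both problems.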
Assuming that there could be other rows and you don't specifically want only Bookbags and Jeans, I'd do it like this:
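The original code for this answer is not preserved here either. A minimal sketch of the idea (take every data row, whatever the item is called) might look like the following. `AllRowsSketch` and `ParseRows` are hypothetical names, the HtmlAgilityPack package is required, and the XPath assumes that only data rows have a link in their first cell.

```csharp
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack;

public static class AllRowsSketch
{
    public static Dictionary<string, decimal> ParseRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var prices = new Dictionary<string, decimal>();
        // Rows whose first cell contains a link; this skips the header row.
        var rows = doc.DocumentNode.SelectNodes("//table[@bgcolor='silver']//tr[td[1]/a]");
        if (rows == null) return prices; // SelectNodes returns null when nothing matches

        foreach (var row in rows)
        {
            var name = row.SelectSingleNode("td[1]/a").InnerText.Trim();
            var priceText = HtmlEntity
                .DeEntitize(row.SelectSingleNode("td[2]")?.InnerText ?? "")
                .Trim();
            // Drop the "$" and keep only rows whose second cell parses as a number.
            if (decimal.TryParse(priceText.TrimStart('$'), NumberStyles.Number,
                                 CultureInfo.InvariantCulture, out var price))
            {
                prices[name] = price;
            }
        }
        return prices;
    }
}
```

Because nothing here refers to Bookbags or Jeans by name, extra rows added to the table later are picked up without changing the code.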