HTML Agility Pack
I have several HTML tables in one web page, like this:
<table border=1>
<tr><td>sno</td><td>sname</td></tr>
<tr><td>111</td><td>abcde</td></tr>
<tr><td>213</td><td>ejkll</td></tr>
</table>
<table border=1>
<tr><td>adress</td><td>phoneno</td><td>note</td></tr>
<tr><td>asdlkj</td><td>121510</td><td>none</td></tr>
<tr><td>asdlkj</td><td>214545</td><td>none</td></tr>
</table>
Using HTML Agility Pack, I want to extract only the data in the address and phone number columns (adress and phoneno in the markup above). That means I first have to find which table contains those two columns, and then extract the data from them. How should I do that?
I can get the table, but I don't understand what to do after that.
One more thing: is it feasible to extract data from a table by column name?
Comments (2)
Here are some helper methods that parse HTML tables into DataTable instances. You can iterate through the resulting DataTable array to find the one containing the columns you want. The code is coupled to the format of the tables in the HTML; in this case it takes the column information from the first row (<tr>). Also note that no error checking is performed, so this will break on tables that do not follow the format you specified.
Helper methods:
Usage example:
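A minimal sketch of what such helpers and their use might look like, assuming the HtmlAgilityPack NuGet package is referenced. The names HtmlTableHelper, ParseTable, ParseAllTables and FindTableWithColumns are hypothetical and chosen for illustration; this is not the answerer's original code. Like the description above, it reads the column names from the first row and does almost no error checking.

```csharp
using System;
using System.Data;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class; names are illustrative only.
public static class HtmlTableHelper
{
    // Converts one <table> node into a DataTable.
    // Assumption: the first <tr> holds the column names and every later row
    // has no more cells than that first row.
    public static DataTable ParseTable(HtmlNode tableNode)
    {
        var table = new DataTable();

        // ".//tr" also finds rows inside <thead>/<tbody>, but it would pick up
        // rows of nested tables too, so this assumes tables are not nested.
        var rows = tableNode.SelectNodes(".//tr");
        if (rows == null) return table;

        foreach (var cell in rows[0].SelectNodes("th|td"))
            table.Columns.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());

        foreach (var row in rows.Skip(1))
        {
            object[] values = row.SelectNodes("td")
                .Select(c => (object)HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToArray();
            table.Rows.Add(values);
        }
        return table;
    }

    // Parses every <table> in the HTML string into its own DataTable.
    public static DataTable[] ParseAllTables(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        var tables = doc.DocumentNode.SelectNodes("//table");
        return tables == null
            ? new DataTable[0]
            : tables.Select(t => ParseTable(t)).ToArray();
    }

    // Returns the first table that has all of the given column names, or null.
    public static DataTable FindTableWithColumns(DataTable[] tables, params string[] names)
    {
        return tables.FirstOrDefault(t => names.All(n => t.Columns.Contains(n)));
    }
}
```

With the page from the question loaded into a string html, FindTableWithColumns(HtmlTableHelper.ParseAllTables(html), "adress", "phoneno") should select the second table, and each DataRow can then be read by column name, for example row["adress"] and row["phoneno"].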
Loop through the table rows and get the column values by index.
If you can modify the web page, you could use thead for the header texts and tbody for the actual values.
Then you wouldn't have to skip the first row.
Have a look at an XPath tutorial; XPath is very useful with HtmlAgilityPack.
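As a rough illustration of the loop described above, the sketch below looks up the index of each requested column in the header row and then reads the cells at those indexes from the remaining rows. It assumes the HtmlAgilityPack NuGet package; ColumnExtractor and Extract are hypothetical names, and the header match is an exact, case-sensitive comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper; the class and method names are illustrative only.
public static class ColumnExtractor
{
    // Returns one string[] per data row, holding the cells of the requested
    // columns, taken from the first table whose first row names all of them.
    public static List<string[]> Extract(string html, params string[] columnNames)
    {
        var result = new List<string[]>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var tables = doc.DocumentNode.SelectNodes("//table");
        if (tables == null) return result;      // no tables on the page

        foreach (var table in tables)
        {
            var rows = table.SelectNodes(".//tr");
            if (rows == null) continue;

            var headerCells = rows[0].SelectNodes("th|td");
            if (headerCells == null) continue;

            // Map each requested column name to its position in the header row.
            var headers = headerCells.Select(c => c.InnerText.Trim()).ToList();
            var indexes = columnNames.Select(n => headers.IndexOf(n)).ToArray();
            if (indexes.Any(i => i < 0)) continue;   // a requested column is missing

            // Skip the header row. If the page used <thead>/<tbody>, selecting
            // ".//tbody/tr" instead would make this Skip(1) unnecessary.
            foreach (var row in rows.Skip(1))
            {
                var cells = row.SelectNodes("td");
                if (cells == null || indexes.Any(i => i >= cells.Count)) continue;
                result.Add(indexes.Select(i => cells[i].InnerText.Trim()).ToArray());
            }
            return result;                       // stop at the first matching table
        }
        return result;
    }
}
```

For the page in the question, ColumnExtractor.Extract(html, "adress", "phoneno") should skip the first table and return the two data rows of the second one, each as an address and phone number pair.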