HTML table to CSV: CSV formatting problem
I am stuck on how to create a proper CSV from an HTML table. I am using HtmlAgilityPack to read the HTML from a string and create an HtmlDocument. Then I use XPath to loop through the rows and columns.
The problem is that I am unable to determine the correct grid position (x, y) of a particular cell.
Example HTML:
<html>
<body>
<table border="1">
<tr>
<td rowspan="2">
100
</td>
<td>
200
</td>
<td colspan="2">
300
</td>
</tr>
<tr>
<td colspan="2">
400
</td>
<td>
600
</td>
</tr>
<tr>
<td>
400
</td>
<td>
500
</td>
<td>
600
</td>
</tr>
</table>
</body>
</html>
When I open it in Excel and save it as CSV, I get the desired output, which is:
100,200,300,
,400,,600
400,500,600,
Can someone help me create the same output in .NET, respecting rowspan and colspan?
Thanks!
Dex
Comments (1)
You don't need to know which row and column you are on. All you need to do is add a "," for each new column you find and a line break every time you reach the end of a row.
If you navigate the document as if it were an XML document, all you have to do is go through all the TR nodes, adding a line break when you reach the end of each one's child node list, and iterate through the TD nodes of each TR node, adding a "," where necessary.
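Applied literally to the sample table, one comma per TD would give 400,600 for the second row, so reproducing the Excel output probably needs some bookkeeping for rowspan and colspan. Below is a minimal sketch of one way to do that, written in Python with only the standard library so the logic is easy to follow; it is not HtmlAgilityPack code, although the same slot tracking should carry over to a loop over TR and TD nodes in .NET. The names TableGridParser, table_to_csv and SAMPLE are made up for this illustration, and the sketch assumes a single, non-nested table with numeric span attributes.

```python
import csv
import io
from html.parser import HTMLParser


class TableGridParser(HTMLParser):
    """Places each cell of one (non-nested) table on a (row, col) grid."""

    def __init__(self):
        super().__init__()
        self.grid = {}    # (row, col) -> text; slots covered by a span hold ""
        self.row = -1     # index of the current <tr>
        self.col = 0      # next candidate column in the current row
        self.cell = None  # (row, col, text_parts) while inside a td/th

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row += 1
            self.col = 0
        elif tag in ("td", "th") and self.row >= 0:
            a = dict(attrs)
            # Assumption: span attributes, when present, are valid integers.
            rowspan = int(a.get("rowspan") or 1)
            colspan = int(a.get("colspan") or 1)
            # Skip slots already claimed by a rowspan from an earlier row.
            while (self.row, self.col) in self.grid:
                self.col += 1
            # Reserve every slot this cell covers; only the top-left gets text.
            for r in range(self.row, self.row + rowspan):
                for c in range(self.col, self.col + colspan):
                    self.grid[(r, c)] = ""
            self.cell = (self.row, self.col, [])
            self.col += colspan

    def handle_data(self, data):
        if self.cell is not None:
            self.cell[2].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            r, c, parts = self.cell
            self.grid[(r, c)] = "".join(parts).strip()
            self.cell = None


def table_to_csv(html):
    """Returns CSV text in which spanned slots appear as empty fields."""
    parser = TableGridParser()
    parser.feed(html)
    parser.close()
    n_rows = parser.row + 1
    n_cols = 1 + max((c for (_, c) in parser.grid), default=-1)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for r in range(n_rows):
        writer.writerow([parser.grid.get((r, c), "") for c in range(n_cols)])
    return out.getvalue()


# The table from the question, condensed onto fewer lines.
SAMPLE = """<table border="1">
<tr><td rowspan="2">100</td><td>200</td><td colspan="2">300</td></tr>
<tr><td colspan="2">400</td><td>600</td></tr>
<tr><td>400</td><td>500</td><td>600</td></tr>
</table>"""

print(table_to_csv(SAMPLE), end="")
```

For the sample table this prints the three lines shown in the question: 100,200,300, then ,400,,600 then 400,500,600, with the trailing commas coming from the fourth column that only the colspan cells reach. A dictionary keyed by (row, col) is used instead of a list of lists so that a rowspan can reserve slots in rows that have not been read yet.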