Jsoup: select based on a condition
I have the following HTML table element:
<table class='myTable'>
<tbody>
<tr>
<th>header1</th>
<td>data1</td>
</tr>
<tr>
<th>header2</th>
<td><table><tbody><tr><th>subheader1</th><td>subdata1</td></tr>
<tr><th>subheader2</th><td>subdata2</td></tr>
</tbody></table></td>
</tr>
<tr>
<th>header3</th>
<td>data3</td>
</tr>
....
</tbody>
</table>
How can I select the headers in the table whose next td element does not contain a table? In the example above, only header1 and header3 should be selected.
What I have at the moment is:
Elements elements = doc.select("table[class=" + myTable + "]");
Element table;
if (elements.size() > 0) {
    table = elements.get(0);
} else {
    return someMyObj;
}
// "th AND SOME CONDITIONS" is a placeholder for the selector I am looking for
Iterator<Element> ite = table.select("th AND SOME CONDITIONS").iterator();
while (ite.hasNext()) {
    Element header = ite.next();
}
Comments (1)
Try this
The selector selects all th children of every tr that does not contain a table, where each such tr is in turn a child of the tbody of the context element.
BTW, I changed your while loop to a for loop, but the idea stays the same.
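A selector matching that description could be sketched as follows. This is one reading of the description, not necessarily the answerer's exact code. It assumes jsoup 1.11.1 or later (for `selectFirst`) and a table with the class `myTable`, as in the question. The class name `HeaderSelector` and the method `selectHeaders` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class HeaderSelector {

    // Returns the text of each top-level <th> whose row contains no nested table.
    // Hypothetical helper; the selector string is an assumption based on the
    // description "th children of tr that don't contain table, under tbody".
    static List<String> selectHeaders(String html) {
        Document doc = Jsoup.parse(html);
        Element table = doc.selectFirst("table.myTable");
        List<String> headers = new ArrayList<>();
        if (table == null) {
            return headers; // no matching table found
        }
        // "> tbody > tr" restricts the match to rows of the outer table only;
        // ":not(:has(table))" drops rows that contain a nested table.
        for (Element header : table.select("> tbody > tr:not(:has(table)) > th")) {
            headers.add(header.text());
        }
        return headers;
    }

    public static void main(String[] args) {
        String html = "<table class='myTable'><tbody>"
                + "<tr><th>header1</th><td>data1</td></tr>"
                + "<tr><th>header2</th><td><table><tbody>"
                + "<tr><th>subheader1</th><td>subdata1</td></tr>"
                + "<tr><th>subheader2</th><td>subdata2</td></tr>"
                + "</tbody></table></td></tr>"
                + "<tr><th>header3</th><td>data3</td></tr>"
                + "</tbody></table>";
        System.out.println(selectHeaders(html));
    }
}
```

The leading `>` combinator makes the selector relative to the context element, so the rows of the nested table (subheader1, subheader2) are never considered, and the enhanced for loop iterates over the resulting `Elements` directly.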