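How do I parse a table inside a table using Beautiful Soup?
I tried this: s = soup.findAll("table", {"class": "view"})
But it gives me the outer table. I need the table inside that table.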
<table class="view" >
<tr>
<td width="46%" valign="top">
<table>
<tr>
<td>
<div style="adasdasd">
<div class="abc">dasdsadasdasdas</div>
</div>
<div>
<span><span class="aaaaaaa " title="aaaaaaaaaaa"><span>aaaaaaaaaaaaa</span></span> </span>
<b>My Face</b><br />
Hello This is me,
</div>
<div class="abc"">
Dec 6, 2010 by Alis
</div>
</td>
</tr>
</table>
</tr>
</table>
The things I want to scrape are:
Hello This is me,
My Face
Dec 6, 2010 by Alis
Comments (2)
If there's just the one outer table, you could use .find for the first one too, and drop the [0].
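A minimal sketch of the approach this answer describes, assuming the bs4 package (where find_all is the current spelling of findAll) and the standard-library "html.parser" backend. The sample HTML is a trimmed, well-formed stand-in for the markup in the question, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the question's markup (assumption: well-formed HTML).
html = """
<table class="view">
  <tr>
    <td width="46%" valign="top">
      <table>
        <tr>
          <td><b>My Face</b></td>
        </tr>
      </table>
    </td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Variant 1: find_all returns a list, so take the first outer table with [0],
# then search its descendants for the nested <table>.
outer_tables = soup.find_all("table", {"class": "view"})
inner_via_index = outer_tables[0].find("table")

# Variant 2: with a single outer table, find returns it directly, so no [0].
inner_via_find = soup.find("table", {"class": "view"}).find("table")

print(inner_via_find.b.get_text())
```

Both variants reach the same nested table, because calling find on a tag searches only that tag's descendants and never the tag itself.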
Here's some better-formatted HTML:
Note: I actually added a tag because it was missing one.
So that will get you to the element that holds all of your content. From there it's just a little bit of parsing to get the content you actually need.
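As one possible way to do that last bit of parsing, here is a hedged sketch that pulls out the three strings the question asks for. It assumes the bs4 package with the "html.parser" backend, and it uses a cleaned copy of the question's markup: the closing </td> for the outer cell is added and the stray quote in class="abc"" is dropped. The variable names (cell, title, greeting, byline) are illustrative, not from the original answer.

```python
from bs4 import BeautifulSoup

# Cleaned copy of the question's HTML (assumptions: outer </td> added,
# stray double quote after class="abc" removed).
html = """
<table class="view">
<tr>
<td width="46%" valign="top">
<table>
<tr>
<td>
<div style="adasdasd">
<div class="abc">dasdsadasdasdas</div>
</div>
<div>
<span><span class="aaaaaaa " title="aaaaaaaaaaa"><span>aaaaaaaaaaaaa</span></span> </span>
<b>My Face</b><br />
Hello This is me,
</div>
<div class="abc">
Dec 6, 2010 by Alis
</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Walk from the outer table to the nested table, then to its only cell.
cell = soup.find("table", {"class": "view"}).find("table").find("td")

# The title is the text of the <b> tag.
title_tag = cell.find("b")
title = title_tag.get_text(strip=True)

# The greeting is the last non-blank string inside the <div> that holds <b>.
greeting = list(title_tag.parent.stripped_strings)[-1]

# Two <div class="abc"> elements exist in the cell; the byline is the last one.
byline = cell.find_all("div", {"class": "abc"})[-1].get_text(strip=True)

print(greeting)
print(title)
print(byline)
```

This leans on the position of each piece within the cell, so it would need adjusting if the real page orders its elements differently.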