How to scrape this with Simple HTML DOM
I'm trying to use Simple HTML DOM to extract elements from a file that looks like the sample below.
- The file has several tables that look the same, each with class=sometable.
- Each table has a few <tr class=sometr> rows.
- Inside each tr there is a th that holds the title and a td that holds the category.
What I want to extract is every title (class=title) and its corresponding category number (class=category), for all table rows in all tables. I've loaded the file into $html. Can someone tell me what I'm supposed to find after that? I even tried $collection = $html->find('tr'); and did a var_dump on the collection but got nothing, so it looks like I'm not selecting correctly.
<table class="sometable">
<tbody>
<tr class="sometr">
<th><a class="title">Table 1 Title1</a></th>
<td class="category" id="categ-113"></td>
<td class="somename">Table 1 Title 1 name</td>
</tr>
<tr></tr>
<tr></tr>
</tbody>
</table>
<table class="sometable">
</table>
<table class="sometable">
</table>
Comments (1)
I have tested this and it works
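For anyone landing here, a minimal sketch of one way to do this with Simple HTML DOM. It assumes simple_html_dom.php sits next to the script, that the markup matches the sample in the question, and that the category number is the numeric part of the td's id attribute (categ-113 gives 113). The $rows array and its keys are names made up for illustration. If find('tr') comes back empty, it may be worth checking that the load itself succeeded, since the loader returns false on failure.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Sample markup taken from the question. With a real file,
// file_get_html($path) would take the place of str_get_html($markup).
$markup = <<<HTML
<table class="sometable">
<tbody>
<tr class="sometr">
<th><a class="title">Table 1 Title1</a></th>
<td class="category" id="categ-113"></td>
<td class="somename">Table 1 Title 1 name</td>
</tr>
<tr></tr>
</tbody>
</table>
HTML;

$html = str_get_html($markup);
if ($html === false) {
    // The loader returns false when the input is empty or cannot be loaded.
    throw new RuntimeException('Could not parse the markup');
}

$rows = []; // hypothetical result container
foreach ($html->find('tr.sometr') as $tr) {
    // The second argument 0 asks for the first match only (null if none).
    $title = $tr->find('a.title', 0);
    $category = $tr->find('td.category', 0);
    if ($title === null || $category === null) {
        continue; // skip rows that lack either element
    }
    $rows[] = [
        'title' => trim($title->plaintext),
        // Assumption: the category number is the id without its "categ-" prefix.
        'category' => str_replace('categ-', '', $category->id),
    ];
}

print_r($rows);
```

Selecting tr.sometr rather than plain tr skips the empty rows in the sample, and searching from each $tr keeps every title paired with the category from the same row.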