How do I access Simple DOM selectors?
I can access some of the 'class' items with
$ret = $html->find('articleINfo');
and then print the first key of the returned array.
However, there are other tags I need, such as the span with id="firstArticle_0", and I cannot seem to find them.
$ret = $html->find('#span=id[ etc ]');
In some cases something is returned, but it is either not an array or an array with empty keys.
Unfortunately I cannot use var_dump to inspect the object, since var_dump produces 1000 pages of unreadable junk. The HTML looks like this:
<div id="articlething">
<p class="byline">By Lord Byron and <a href="www.marriedtothesea.com">Alister Crowley</a></p>
<p>
<span class="location">GEORGIA MOUNTAINS, Canada</span> |
<span class="timestamp">Fri Apr 29, 2011 11:27am EDT</span>
</p>
</div>
<span id="midPart_0"></span><span class="mainParagraph"><p><span class="midLocation">TUSCALOOSA, Alabama</span> - Who invented cheese? Everyone wants to know. They held a big meeting. Tom Cruise is a scientologist. </p>
</span><span id="midPart_1"></span><p>The president and his family visited Chuck-e-cheese in the morning </p><span id="midPart_2"></span><p>In Russia, 900 people were lost in the balls.</p><span id="midPart_3">
Comments (3)
Simple HTML DOM makes it easy to find a span with a specific class.
If you want all spans with class=location, select them with find(), for example using the selector span.location.
Then loop over the returned array and print each match.
It worked for me using your example; my output was:
label=span, class=location: Found 1
0: GEORGIA MOUNTAINS, Canada
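The code from this answer did not survive on this page, so what follows is a minimal sketch rather than the answerer's original. It assumes the library file simple_html_dom.php sits next to the script and uses a shortened copy of the HTML from the question. The variable names ($markup, $label, $class, $spans, $mid) are illustrative.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once 'simple_html_dom.php';

// Shortened copy of the markup from the question.
$markup = <<<HTML
<div id="articlething">
<p class="byline">By Lord Byron and <a href="www.marriedtothesea.com">Alister Crowley</a></p>
<p>
<span class="location">GEORGIA MOUNTAINS, Canada</span> |
<span class="timestamp">Fri Apr 29, 2011 11:27am EDT</span>
</p>
</div>
<span id="midPart_0"></span><span class="mainParagraph"><p><span class="midLocation">TUSCALOOSA, Alabama</span> - Who invented cheese?</p></span>
HTML;

$html = str_get_html($markup);

$label = 'span';
$class = 'location';

// "tag.class" selects every <span> whose class list contains "location".
$spans = $html->find("$label.$class");
echo "label=$label, class=$class: Found " . count($spans) . "\n";

foreach ($spans as $index => $span) {
    // plaintext is the element's text with the tags stripped.
    echo "$index: " . $span->plaintext . "\n";
}

// An id is selected with "#", e.g. "span#midPart_0".
// A second argument (an index) makes find() return one element or null
// instead of an array.
$mid = $html->find('span#midPart_0', 0);
if ($mid !== null) {
    echo 'id=' . $mid->id . "\n";
}
```

With the markup above this should report a single match, in line with the output quoted in the answer. Printing $span->plaintext or $span->outertext for one element is usually far more readable than calling var_dump on the whole object.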
Hope that helps. Simple HTML DOM is great for what it does and easy to use once you get the hang of it. Keep trying and you will build up a set of examples that you can reuse over and over. I've scraped some pretty crazy pages, and it gets easier and easier.
Try phpQuery. It worked very well for me and is extremely easy to use: http://code.google.com/p/phpquery/
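This answer gives no code, so here is a rough sketch of what the same lookup might look like with phpQuery. The include path is an assumption that depends on where the library was unpacked, and the markup is a shortened copy of the HTML from the question.

```php
<?php
// Assumption: phpQuery was downloaded and this path points at its main file.
require_once 'phpQuery/phpQuery.php';

$markup = '<div id="articlething"><p>'
        . '<span class="location">GEORGIA MOUNTAINS, Canada</span> | '
        . '<span class="timestamp">Fri Apr 29, 2011 11:27am EDT</span>'
        . '</p></div><span id="midPart_0"></span>';

// Load the markup; pq() then queries the most recently loaded document.
phpQuery::newDocumentHTML($markup);

// jQuery-style selectors: a class is written with ".", an id with "#".
$location  = pq('span.location')->text();
$timestamp = pq('span.timestamp')->text();
$midCount  = pq('#midPart_0')->size();

echo "location: $location\n";
echo "timestamp: $timestamp\n";
echo "elements with id midPart_0: $midCount\n";
```

The selector syntax follows jQuery, so an id is matched with '#midPart_0' rather than with an attribute-style string such as the one tried in the question.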
The docs for the PHP Simple HTML DOM parser are spotty on how to read Open Graph meta tags. Here's what seems to work for me:
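The answerer's snippet is missing from this page. As a stand-in, the sketch below shows one way Open Graph tags could be collected with Simple HTML DOM. It is not the original code. It assumes simple_html_dom.php is available next to the script, and the sample markup and the $openGraph array are made up for illustration.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical page head with Open Graph tags, used only for illustration.
$markup = <<<HTML
<html><head>
<meta property="og:title" content="Who invented cheese?">
<meta property="og:type" content="article">
<meta name="description" content="Not an Open Graph tag">
</head><body></body></html>
HTML;

$html = str_get_html($markup);

$openGraph = [];
foreach ($html->find('meta') as $meta) {
    // Attributes are exposed as properties; a missing attribute yields false.
    $property = $meta->property;
    if (is_string($property) && strpos($property, 'og:') === 0) {
        // Key the result by the og:* name, value taken from "content".
        $openGraph[$property] = $meta->content;
    }
}

print_r($openGraph);
```

Looping over every meta element and filtering in PHP avoids relying on how the selector parser treats the colon in names like og:title.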