Using Zend_Dom as a screen scraper
How?
More to the point...
this:
$url = 'http://php.net/manual/en/class.domelement.php';
$client = new Zend_Http_Client($url);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);
$result = $dom->query('div.note');
Zend_Debug::dump($result);
gives me this:
object(Zend_Dom_Query_Result)#867 (7) {
["_count":protected] => NULL
["_cssQuery":protected] => string(8) "div.note"
["_document":protected] => object(DOMDocument)#79 (0) {
}
["_nodeList":protected] => object(DOMNodeList)#864 (0) {
}
["_position":protected] => int(0)
["_xpath":protected] => NULL
["_xpathQuery":protected] => string(33) "//div[contains(@class, ' note ')]"
}
And I cannot for the life of me figure out how to do anything with this.
I want to extract the various parts of the retrieved data (that being the div with the class "note" and any of the elements inside it... like the text and urls) but cannot get anything working.
Someone pointed me to the DOMElement class over at php.net but when I try using some of the methods mentioned, I can't get things to work. How would I grab a chunk of html from a page and go through it grabbing the various parts? How do I inspect this object I am getting back so I can at least figure out what is in it?
Help?
Comments (1)
The Iterator implementation of Zend_Dom_Query_Result returns a DOMElement object for each iteration. Inside a foreach loop over the result, each $element is a DOMElement, so you can call any DOMElement method on it. You can also access the document element directly, or you can use Zend_Dom_Query_Result to do so.
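As a rough illustration of the iteration described in this answer, here is a minimal sketch that walks a set of DOMElement objects and pulls out their text, links, and markup. To keep it runnable without Zend Framework or a network request, it uses PHP's built-in DOMDocument and DOMXPath on a small stand-in HTML string. The helper name summarizeNotes, the sample markup, and the XPath expression are assumptions made for the example, not part of the original question or answer.

```php
<?php
// Hypothetical helper (not part of Zend Framework): collects the text, link
// targets, and markup of each matched element. It accepts anything that
// foreach can walk and that yields DOMElement objects, such as a DOMNodeList
// or, per the answer above, a Zend_Dom_Query_Result.
function summarizeNotes($elements)
{
    $notes = array();
    foreach ($elements as $element) {
        // Gather the href of every <a> nested inside this element.
        $links = array();
        foreach ($element->getElementsByTagName('a') as $anchor) {
            $links[] = $anchor->getAttribute('href');
        }
        $notes[] = array(
            'class' => $element->getAttribute('class'),
            'text'  => trim($element->textContent),
            'links' => $links,
            // Serialize just this element, including its tags.
            'html'  => $element->ownerDocument->saveXML($element),
        );
    }
    return $notes;
}

// Stand-in markup (an assumption) so the sketch runs offline.
$html = '<html><body>'
      . '<div class="note">See <a href="/manual/en/book.dom.php">DOM</a> docs.</div>'
      . '<div class="other">ignored</div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// An XPath roughly equivalent to the CSS selector 'div.note'.
$nodes = $xpath->query(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' note ')]"
);

$notes = summarizeNotes($nodes);
print_r($notes);
```

With Zend Framework 1 loaded, passing the result of $dom->query('div.note') from the question to the same helper should behave the same way, since the loop only relies on each item being a DOMElement. The result object is also countable, so count($result) reports how many nodes matched, and $result->getDocument() returns the underlying DOMDocument. The empty-looking DOMDocument and DOMNodeList entries in the dump do not necessarily mean nothing matched, because older PHP versions show DOM objects without properties when dumped.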