getElementsByTagName title comes back as a DOMNodeList object
Our script uses DOM to parse all the a tags in a document, then loops through the child nodes and extracts information. That part works fine. Here's how the code starts:
@$dom->loadHTML($str);
$documentLinks = $dom->getElementsByTagName("a");
Part of the loop:
$this->count]['href'] = strip_tags($documentLink->getAttribute('href'));
I now need to get the title tag from each page we're looping through, so I thought I could do:
$documentTitle = $dom->getElementsByTagName("title");
$documentLinks = $dom->getElementsByTagName("a");
Then add this to the loop/array to get the document title, but it comes back as "[title] => DOMNodeList Object()". How can I include the title tag in the loop that goes through the a tags/child nodes?
$this->count]['title'] = $documentTitle;
Comments (1)
getElementsByTagName returns a DOMNodeList object, even when only one element matches. You want the text content of the first item in the list (a page should have only one title).
Try this:
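A minimal sketch of that approach, assuming `$str` holds the page HTML as in the question. The sample markup and the `$links` array are made up for illustration and stand in for whatever array the real loop fills:

```php
<?php
// Sample markup standing in for the fetched page ($str in the question).
$str = '<html><head><title>Example page</title></head>'
     . '<body><a href="/one">One</a><a href="/two">Two</a></body></html>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom->loadHTML($str);
libxml_clear_errors();

// getElementsByTagName() always returns a DOMNodeList, so take the first
// node with item(0) and read its text; item(0) is null if there is no title.
$titleNode = $dom->getElementsByTagName('title')->item(0);
$documentTitle = $titleNode !== null ? trim($titleNode->textContent) : '';

// $links is a hypothetical stand-in for the array built in the question's loop.
$links = [];
foreach ($dom->getElementsByTagName('a') as $documentLink) {
    $links[] = [
        'href'  => strip_tags($documentLink->getAttribute('href')),
        'title' => $documentTitle, // a plain string now, not a DOMNodeList
    ];
}

print_r($links);
```

The title is read once, before the loop, because it belongs to the document rather than to any single a tag. Every link entry then reuses the same string.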