A question about cURL and XPath queries

Posted on 2024-10-12 02:42:56

I need some help with my XPath query. I can get this code to work with just about every site I need to scrape, except for one small part of a particular site, where I just get a blank page. Does anyone have an idea how I can do this better?

//
$target_url = "http://www.teambuy.ca/vancouver/";
$userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)';

// make the cURL request to $target_url
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT,$userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html= curl_exec($ch);
if (!$html) {
    echo "<br />cURL error number:" .curl_errno($ch);
    echo "<br />cURL error:" . curl_error($ch);
    exit;
}

// parse the html into a DOMDocument
$dom = new DOMDocument();
@$dom->loadHTML($html);

// grab the matching nodes on the page
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body/div[@id='pagewrap']/div[@id='content']/div[@id='bottomSection']/div[@id='bottomRight']/div[@id='sideDeal']/div[2]/div/a/center/span");

foreach ($hrefs as $e) {
    $e->nodeValue;
}
$insert = $e->nodeValue;
echo "$insert";

--EDIT--

No luck...
Fatal error: Call to a member function loadHTMLfile() on a non-object in ... Line 4
//

$xpath_query = $dom->loadHTMLfile("http://www.teambuy.ca/vancouver/");

$hrefs = $xpath_query->evaluate("/html/body/div[7]/div[4]/div[3]/div[2]/div[1]/div[2]/div/a/center/span");

foreach ($hrefs as $e) {
    echo $e->nodeValue;
}
$insert = $e->nodeValue;

echo "$insert";


稀香 2024-10-19 02:42:56

You don't need cURL for this. Create the document and let it load the page directly:

$dom = new DOMDocument();
$dom->loadHTMLFile("http://www.teambuy.ca/calgary/");
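
As a rough illustration of the loading step, the sketch below writes a small hypothetical HTML snippet to a temporary file and loads it with `loadHTMLFile()`. The snippet stands in for the real page, which is an assumption made so the example runs offline. It also shows two details that are easy to trip over: the `DOMDocument` object has to be created before the method is called, and the method returns a boolean rather than something that can be queried.

```php
<?php
// Hypothetical stand-in markup; the real page structure is not known here.
$sampleHtml = '<html><body><div id="sideDeal"><a href="#"><span>$25 for $50</span></a></div></body></html>';

// Write it to a temporary file so loadHTMLFile() has something local to read.
$path = tempnam(sys_get_temp_dir(), 'deal');
file_put_contents($path, $sampleHtml);

// The object must exist first; calling loadHTMLFile() on an undefined
// variable is what produces a "member function on a non-object" error.
$dom = new DOMDocument();

// Collect parser warnings instead of printing them (an alternative to @).
libxml_use_internal_errors(true);
$loaded = $dom->loadHTMLFile($path); // true on success, false on failure
libxml_clear_errors();
unlink($path);

if (!$loaded) {
    exit("Could not load the document\n");
}

$spanCount = $dom->getElementsByTagName('span')->length;
echo "Loaded, spans found: {$spanCount}\n";
```

Because the return value is only a success flag, any XPath query still has to be run against `$dom` in a separate step.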

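There is no shortcut around

$xpath = new DOMXPath($dom);

though: DOMDocument has no xpath() method (that one belongs to SimpleXMLElement), so run the query through the DOMXPath object:

$hrefs = $xpath->query($xpath_query);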

I imagine your XPath query could be simplified as well.
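
One way the query could be shortened, assuming the `sideDeal` id is unique on the page, is to anchor on that id with `//` rather than spelling out every ancestor. The markup below is a hypothetical reduction of the element path quoted in the question, not the real page, and `DOMXPath` is used because that is the class the DOM extension provides for queries.

```php
<?php
// Hypothetical markup mirroring the element path from the question.
$html = <<<'HTML'
<html><body>
<div id="pagewrap"><div id="content"><div id="bottomSection"><div id="bottomRight">
  <div id="sideDeal">
    <div>header</div>
    <div><div><a href="#"><center><span>Sample deal</span></center></a></div></div>
  </div>
</div></div></div></div>
</body></html>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// The full path from the question.
$long = "/html/body/div[@id='pagewrap']/div[@id='content']/div[@id='bottomSection']"
      . "/div[@id='bottomRight']/div[@id='sideDeal']/div[2]/div/a/center/span";
// Shorter form: jump straight to the id, then to any span inside a link.
$short = "//div[@id='sideDeal']//a//span";

$longNodes = $xpath->query($long);
$shortNodes = $xpath->query($short);

echo $longNodes->length, ' ', $shortNodes->length, "\n";
echo $shortNodes->item(0)->nodeValue, "\n";
```

The shorter expression is less precise, so it may match extra spans on a real page; the trade-off is that it keeps working when wrapper elements are added or reordered.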

Also, this loop

foreach ($hrefs as $e) {
    $e->nodeValue;
}

does nothing, because the value is read but never used. You might want to try this instead:

foreach ($hrefs as $e) {
    echo $e->nodeValue;
}
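
To make the difference between the two loops concrete, here is a small sketch on hypothetical markup (two made-up spans, not the real page). It also avoids reading `$e` after the loop has finished, as the question's code does: if the query matches nothing, `$e` is never set, which is one plausible way to end up with a blank page. The variable name `$values` is introduced only for illustration.

```php
<?php
// Hypothetical markup with two matching spans.
$html = '<html><body><div id="sideDeal">'
      . '<a href="#"><span>First deal</span></a>'
      . '<a href="#"><span>Second deal</span></a>'
      . '</div></body></html>';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$hrefs = $xpath->query("//div[@id='sideDeal']//span");

$values = [];
foreach ($hrefs as $e) {
    // Reading nodeValue alone has no effect; store (or echo) it.
    $values[] = trim($e->nodeValue);
}

if ($values === []) {
    // Make an empty result visible instead of printing nothing.
    echo "No nodes matched the query\n";
} else {
    echo implode("\n", $values), "\n";
}
```

Collecting the values into an array keeps every match, whereas assigning `$e->nodeValue` after the loop would keep only the last one.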


About the author: 夏末
