Problem fetching Google search results through a proxy with cURL and PHP

Posted 2024-12-22 08:32:49

This script works fine when fetching google.com, but not google.com/search?q=test. Without CURLOPT_FOLLOWLOCATION, I get a 302 Moved. With it, I get a page asking me to solve a CAPTCHA. I've tried several different U.S.-based proxies and have varied the user agent string. Is there something I'm missing here?

function my_fetch($url, $proxy, $user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8')
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);         // proxy as host:port
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_HEADER, 0);             // body only, no response headers
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);     // return the response instead of printing it
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);  // without this, the result is a 302 Moved

    curl_setopt($ch, CURLOPT_TIMEOUT, 20);           // seconds
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$url = 'http://www.google.com/search?q=test';

$proxy = '152.26.53.4:80';
echo my_fetch($url, $proxy);

Please don't respond with suggestions to use the API instead. The API is not sufficient for my needs.
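
One way to narrow this down is to stop at the first response and look at where the 302 points before following it. The sketch below is only illustrative: the helper names classify_response and inspect_fetch are hypothetical, and the check for a path starting with "/sorry" is an assumption about what a CAPTCHA or block redirect looks like, not something confirmed in this thread.

```php
<?php
// Hypothetical helper: label a response from its status code and Location URL.
// Assumption: a redirect whose path starts with "/sorry" is a CAPTCHA/block page.
function classify_response(int $status, ?string $location): string
{
    if ($status >= 300 && $status < 400 && $location !== null) {
        $path = (string) parse_url($location, PHP_URL_PATH);
        return strpos($path, '/sorry') === 0 ? 'captcha' : 'redirect';
    }
    return $status === 200 ? 'ok' : 'other';
}

// Hypothetical helper: fetch through the proxy WITHOUT following redirects
// and report the status code and redirect target of the first response.
function inspect_fetch(string $url, string $proxy): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROXY          => $proxy,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => false, // stop at the first response
        CURLOPT_TIMEOUT        => 20,
    ]);
    $body     = curl_exec($ch);
    $error    = $body === false ? curl_error($ch) : null;
    $status   = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $location = curl_getinfo($ch, CURLINFO_REDIRECT_URL) ?: null;

    return [
        'status'   => $status,
        'location' => $location,
        'kind'     => classify_response($status, $location),
        'error'    => $error,
    ];
}

// Usage (performs a real request, so it is left commented out):
// print_r(inspect_fetch('http://www.google.com/search?q=test', '152.26.53.4:80'));
```

If 'kind' comes back as 'captcha', the request was flagged before any redirect was followed, which would suggest the proxy IP or request pattern is the issue and not CURLOPT_FOLLOWLOCATION itself.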

Comments (2)

仅此而已 2024-12-29 08:32:49

Google no longer works with cURL.

Google no longer allows access through cURL and may return a 302 Moved response. If you want to use it, you have to use the API.

Thanks

小瓶盖 2024-12-29 08:32:49

You can try to do that with PhantomJS:

var page = require("webpage").create();
var homePage = "http://www.google.com/";

// Note: onLoadFinished fires after every page load, including the one
// triggered by the form submit below.
page.onLoadFinished = function (status) {
  var url = page.url;

  console.log("Status:  " + status);
  console.log("Loaded:  " + url);

  // Inject jQuery, then fill in and submit the search form inside the page.
  page.includeJs("http://code.jquery.com/jquery-1.8.3.min.js", function () {
    console.log("Loaded jQuery!");
    page.evaluate(function () {
      var searchBox = $(".lst");
      var searchForm = $("form");

      searchBox.val("your query");
      searchForm.submit();
    });
  });

  // Give the results page time to load, then take a screenshot and exit.
  window.setTimeout(function () {
    page.render("google.png");
    phantom.exit(0);
  }, 1000); // wait 1,000 ms (1 s)
};

page.open(homePage);