curl_multi_exec stops if one URL returns a 404; how can I change that?

Posted on 2024-08-30 16:31:26

Currently, my curl_multi_exec loop stops if one of the URLs it connects to doesn't work, so I have a few questions:

1: Why does it stop? That doesn't make sense to me.

2: How can I make it continue?

EDIT: Here is my code:

    $SQL = mysql_query("SELECT url FROM shells") ;
    $mh = curl_multi_init();
    $handles = array();
    while($resultSet = mysql_fetch_array($SQL)){       
            //load the urls and send GET data                     
            $ch = curl_init($resultSet['url'] . $fullcurl); 
            //Only load it for two seconds (Long enough to send the data)
            curl_setopt($ch, CURLOPT_TIMEOUT, 5);           
            curl_multi_add_handle($mh, $ch);
            $handles[] = $ch;
    }

    // Create a status variable so we know when exec is done.
    $running = null;
    //execute the handles
    do {
      // Call exec.  This call is non-blocking, meaning it works in the background.
      curl_multi_exec($mh,$running);
      // Sleep while it's executing.  You could do other work here, if you have any.
      sleep(2);
    // Keep going until it's done.
    } while ($running > 0);

    // For loop to remove (close) the regular handles.
    foreach($handles as $ch)
    {
      // Remove the current array handle.
      curl_multi_remove_handle($mh, $ch);
    } 
    // Close the multi handle
    curl_multi_close($mh);

Comments (2)

想念有你 2024-09-06 16:31:26

Here you go:

    $urls = array
    (
        0 => 'http://bing.com',
        1 => 'http://yahoo.com/thisfiledoesntexistsoitwill404.php', // 404
        2 => 'http://google.com',
    );

    $mh = curl_multi_init();
    $handles = array();

    // Create one easy handle per URL and attach it to the multi handle.
    foreach ($urls as $url)
    {
        $handles[$url] = curl_init($url);

        curl_setopt($handles[$url], CURLOPT_TIMEOUT, 3);
        curl_setopt($handles[$url], CURLOPT_AUTOREFERER, true);
        // Treat HTTP status codes >= 400 as a failed transfer.
        curl_setopt($handles[$url], CURLOPT_FAILONERROR, true);
        curl_setopt($handles[$url], CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($handles[$url], CURLOPT_RETURNTRANSFER, true);
        // Note: these two lines disable TLS certificate checks; avoid in production.
        curl_setopt($handles[$url], CURLOPT_SSL_VERIFYHOST, false);
        curl_setopt($handles[$url], CURLOPT_SSL_VERIFYPEER, false);

        curl_multi_add_handle($mh, $handles[$url]);
    }

    $running = null;

    // Poll every 200 ms until no transfer is still running.
    do {
        curl_multi_exec($mh, $running);
        usleep(200000);
    } while ($running > 0);

    // Replace each handle with its response body, or false on error.
    foreach ($handles as $key => $value)
    {
        $handles[$key] = false;

        if (curl_errno($value) === 0)
        {
            $handles[$key] = curl_multi_getcontent($value);
        }

        curl_multi_remove_handle($mh, $value);
        curl_close($value);
    }

    curl_multi_close($mh);

    echo '<pre>';
    print_r(array_map('htmlentities', $handles));
    echo '</pre>';

Returns:

    Array
    (
        [http://bing.com] => <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html...
        [http://yahoo.com/thisfiledoesntexistsoitwill404.php] => 
        [http://google.com] => <!doctype html><html><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><title>Google</title>...
    )

As you can see, all URLs are fetched, including google.com, which comes after the Yahoo page that returns a 404.
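If you also want to know why a particular URL failed, one option is to read the per-handle result with curl_multi_info_read and the HTTP status with curl_getinfo, and to wait with curl_multi_select instead of a fixed sleep. As far as I know, PHP only fills in a handle's error code for a multi transfer once curl_multi_info_read has reported it, so treat that as an assumption worth checking on your version. The following is a minimal sketch: fetch_all and the layout of its result array are made up for illustration, and CURLOPT_FAILONERROR is left off on purpose so that a 404 shows up as a status code rather than an empty body.

```php
<?php
// Hypothetical helper (the name and result layout are assumptions):
// fetch all URLs in parallel and report the outcome of each one.
// URLs are used as array keys, so duplicates would overwrite each other.
function fetch_all(array $urls, $timeout = 5)
{
    $mh = curl_multi_init();
    $handles = array();
    $results = array();

    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
        $results[$url] = array('error' => null, 'status' => 0, 'body' => null);
    }

    // Drive all transfers; one failing URL does not stop the others.
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status === CURLM_OK && $running > 0) {
            // Wait for socket activity (up to 1 second) instead of sleeping blindly.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(100000); // select failed; back off briefly to avoid spinning
            }
        }
    } while ($status === CURLM_OK && $running > 0);

    // Each finished transfer leaves a message carrying its CURLE_* result code.
    while (($info = curl_multi_info_read($mh)) !== false) {
        $url = array_search($info['handle'], $handles, true);
        if ($url !== false && $info['result'] !== CURLE_OK) {
            $results[$url]['error'] = curl_strerror($info['result']);
        }
    }

    foreach ($handles as $url => $ch) {
        $results[$url]['status'] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $results[$url]['body'] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }

    curl_multi_close($mh);

    return $results;
}
```

Calling fetch_all($urls) with the three URLs above should give one entry per URL; a 404 would then appear as 'status' => 404 with 'error' still null, while a timeout or DNS failure would set 'error' to a message.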

风居住的梦幻卍 2024-09-06 16:31:26

I don't have a platform to test this on, but most of the examples I've seen compare the constant returned from curl_multi_exec instead of checking the $running variable.

    //execute the handles
    do {
      // Call exec.  This call is non-blocking, meaning it works in the background.
      $mrc = curl_multi_exec($mh,$running);
      // Sleep while it's executing.  You could do other work here, if you have any.
      sleep(2);
    // Keep going until it's done.
    } while ($mrc == CURLM_CALL_MULTI_PERFORM);

I hope this works for you.
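One caveat: the libcurl documentation states that CURLM_CALL_MULTI_PERFORM is no longer returned as of libcurl 7.20.0, so on a current build a loop that tests only that constant would probably exit after the first call, before the transfers finish. A minimal sketch that checks both the return code and $running might look like this; run_multi is a hypothetical helper name, and the 1-second select timeout is an arbitrary choice.

```php
<?php
// Hypothetical helper (not part of the cURL extension): drive a multi handle
// until every transfer has finished or the multi interface reports an error.
function run_multi($mh)
{
    $running = 0;
    do {
        $status = curl_multi_exec($mh, $running);
        if ($status === CURLM_OK && $running > 0) {
            // Block until a socket is ready, for at most 1 second.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(100000); // select failed; pause briefly before retrying
            }
        }
        // Old libcurl builds may ask for an immediate retry via
        // CURLM_CALL_MULTI_PERFORM; otherwise continue while work remains.
    } while ($status === CURLM_CALL_MULTI_PERFORM
        || ($status === CURLM_OK && $running > 0));

    return $status;
}
```

It would be called as run_multi($mh) in place of the do/while loop, after all handles have been added.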
