How do I fix a maximum execution time error in PHP?

Posted on 2024-08-11 08:26:43

I have a PHP page that I run every minute through a CRON job.

I have been running it for quite some time, but it suddenly started throwing these errors:

Maximum execution time of 30 seconds exceeded in /home2/sharingi/public_html/scrape/functions.php on line 84

The line number will vary with each error, ranging from line 70 up into the 90s.

Here is the code from lines 0-95

function crawl_page( $base_url, $target_url, $userAgent, $links)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL,$target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 100);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); //follow up to 10 redirections - avoids loops

    $html = curl_exec($ch);

    if (!$html) 
    {
        echo "<br />cURL error number:" .curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        //exit;
    }

    //
    // load scraped data into the DOM
    //

    $dom = new DOMDocument();
    @$dom->loadHTML($html);

    //
    // get only LINKS from the DOM with XPath
    //

    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");

    //
    // go through all the links and store to db or whatever
    //  

    for ($i = 0; $i < $hrefs->length; $i++) 
    {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');

        //if the $url does not contain the web site base address: http://www.thesite.com/ then add it onto the front

        $clean_link = clean_url( $base_url, $url, $target_url);
        $clean_link = str_replace( "http://" , "" , $clean_link);
        $clean_link = str_replace( "//" , "/" , $clean_link);

        $links[] = $clean_link;

        //removes empty array values

        foreach($links as $key => $value) 
        { 
            if($value == "") 
            { 
                unset($links[$key]); 
            } 
        } 
        $links = array_values($links); 

        //removes javascript lines

        foreach ($links as $key => $value)
        {
            if ( strpos( $value , "javascript:") !== FALSE )
            {
                unset($links[$key]);
            }
        }
        $links = array_values($links);

        // removes @ lines (email)

        foreach ($links as $key => $value)
        {
            if ( strpos( $value , "@") !== FALSE || strpos( $value, 'mailto:') !== FALSE)
            {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
    }   

    return $links; 
}

What is causing these errors, and how can I prevent them?

浅浅淡淡 2024-08-18 08:26:43

Set the maximum execution time with the set_time_limit function. If you want no time limit (most likely your case), use:

set_time_limit(0);
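
If an unlimited run is more than you want, a finite value works the same way. The following is only a minimal sketch: the 120-second figure is an assumed value for illustration, not something taken from the question.

```php
<?php
// Assumed value for illustration: allow this run up to 120 seconds.
// set_time_limit(0) would remove the limit entirely.
$ok = set_time_limit(120);

// The call restarts the timer and updates max_execution_time for this run only.
$current = ini_get('max_execution_time');

echo 'Limit applied: ', ($ok ? 'yes' : 'no'), ', now ', $current, " seconds\n";
```

Because the job is started every minute, a finite limit may be the safer choice: with no limit, a slow run could still be going when the next one starts.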

離殇 2024-08-18 08:26:43

Cause: some of the functions take more than 30 seconds to complete.
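
One plausible contributor, judging by the reported line numbers (70 into the 90s, which is where the three filtering loops sit), is that the whole $links array is re-scanned three times for every anchor on the page, so the work grows roughly with the number of anchors times the size of $links. A minimal sketch of filtering once, after the loop, is shown below. The helper names is_wanted_link and filter_links are hypothetical and are not part of the original code.

```php
<?php
// Hypothetical helper (not in the original code): decides whether a cleaned
// link should be kept, using the same three rules as the loops in the question.
function is_wanted_link(string $link): bool
{
    if ($link === '') {
        return false; // empty value
    }
    if (strpos($link, 'javascript:') !== false) {
        return false; // javascript: pseudo-links
    }
    if (strpos($link, '@') !== false || strpos($link, 'mailto:') !== false) {
        return false; // e-mail addresses
    }
    return true;
}

// Hypothetical helper: filter once, after all links have been collected,
// instead of re-scanning the whole array on every pass of the outer loop.
function filter_links(array $links): array
{
    return array_values(array_filter($links, 'is_wanted_link'));
}

// Sample data, made up for illustration.
$sample = ['www.example.com/a', '', 'javascript:void(0)', 'mailto:me@example.com', 'www.example.com/b'];
$filtered = filter_links($sample);
print_r($filtered);
```

In crawl_page, this would mean appending every $clean_link inside the for loop and calling filter_links($links) once just before the return. Whether that alone brings the run under 30 seconds depends on how large $links is when it is passed in.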

Solution: increase the maximum execution time (max_execution_time) in the PHP configuration file.

1. If you have access to your global php.ini file (usually at /web/conf; otherwise you can find its location under Configuration File (php.ini) Path in phpinfo), change max_execution_time=30 to max_execution_time=300.

2. If you only have access to a local php.ini file (you can find its location under Loaded Configuration File in phpinfo), change max_execution_time=30 to max_execution_time=300. Note: on some hosts this file is named php5.ini for PHP 5.x+ and php.ini for 4.x.
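
To check which configuration file is actually in effect and what the limit currently is, a short script like the one below may help. It is only a sketch. A cron job that runs the PHP command-line binary can load a different php.ini than the web server does, so run the script the same way the job itself is run.

```php
<?php
// Path of the php.ini file that was actually loaded, or false if none was.
$iniPath = php_ini_loaded_file();

// Current limit in seconds, returned as a string; "0" means no limit.
$limit = ini_get('max_execution_time');

echo 'Loaded php.ini: ', ($iniPath === false ? '(none)' : $iniPath), PHP_EOL;
echo 'max_execution_time: ', $limit, PHP_EOL;
```

If the value printed is still 30 after editing php.ini, the edit was probably made to a file other than the one listed on the first line.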
