How do I fix a maximum execution time error in PHP?
I have a PHP page that I run every minute through a CRON job.
I have been running it for quite some time but suddenly it started throwing up these errors:
Maximum execution time of 30 seconds exceeded in /home2/sharingi/public_html/scrape/functions.php on line 84
The line number will vary with each error, ranging from line 70 up into the 90s.
Here is the code from lines 0-95
function crawl_page( $base_url, $target_url, $userAgent, $links)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL, $target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 100);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); // follow up to 10 redirections - avoids loops
    $html = curl_exec($ch);
    if (!$html)
    {
        echo "<br />cURL error number:" . curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        //exit;
    }
    //
    // load scraped data into the DOM
    //
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    //
    // get only LINKS from the DOM with XPath
    //
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body//a");
    //
    // go through all the links and store to db or whatever
    //
    for ($i = 0; $i < $hrefs->length; $i++)
    {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        // if the $url does not contain the web site base address: http://www.thesite.com/ then add it onto the front
        $clean_link = clean_url($base_url, $url, $target_url);
        $clean_link = str_replace("http://", "", $clean_link);
        $clean_link = str_replace("//", "/", $clean_link);
        $links[] = $clean_link;
        // removes empty array values
        foreach ($links as $key => $value)
        {
            if ($value == "")
            {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
        // removes javascript lines
        foreach ($links as $key => $value)
        {
            if (strpos($value, "javascript:") !== FALSE)
            {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
        // removes @ lines (email)
        foreach ($links as $key => $value)
        {
            if (strpos($value, "@") !== FALSE || strpos($value, 'mailto:') !== FALSE)
            {
                unset($links[$key]);
            }
        }
        $links = array_values($links);
    }
    return $links;
}
What is causing these errors, and how can I prevent them?
2 Answers
You should set the maximum execution time using the set_time_limit function. If you want unlimited time (most likely your case), use:

set_time_limit(0);
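A runnable sketch of that approach (placing the call at the top of the cron-run script is an assumption; 0 removes the limit entirely, while a finite value such as 300 is a safer choice for a job that runs every minute, so a stuck crawl cannot pile up behind later runs):

```php
<?php
// Raise the execution cap before the long-running crawl starts.
// set_time_limit() affects only the current run; it does not modify php.ini.
set_time_limit(300);

// The new cap is reflected in the runtime configuration:
echo ini_get('max_execution_time'); // 300
```

Note that set_time_limit() restarts the timeout counter from zero when called, so calling it again inside a long loop also resets the clock.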
Cause: some of the function calls take more than 30 seconds to complete.
Solution: increase the maximum execution time (max_execution_time) in the PHP configuration file.
1. If you have access to your global php.ini file (usually at /web/conf; otherwise you can get the location from "Configuration File (php.ini) Path" in phpinfo), change max_execution_time=30 to max_execution_time=300.
2. If you only have access to your local php.ini file (you can get the location from "Loaded Configuration File" in phpinfo), change max_execution_time=30 to max_execution_time=300. Note: this file is named php5.ini for PHP 5.x+ and php.ini for 4.x.
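For reference, a sketch of how the directive might look in php.ini (the 300-second value mirrors the answer's suggestion; 0 disables the limit entirely):

```ini
; Maximum execution time of each script, in seconds.
; The default of 30 is what triggers the error in the question.
max_execution_time = 300
```

After editing a global php.ini you generally need to restart the web server (or PHP-FPM) for the change to take effect.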