PHP strpos() crashing script
I have a PHP script that looks for links on a page that it downloads with the curl_multi functions. The download works and I get the data, but my script randomly crashes when it encounters a page that lists the URL as plain text rather than as a link. This is the code:
$fishnof = strpos($nofresult, $supshorturl, 0);
$return[0] = ''; $return[1] = ''; // always good to cleanset
// Make sure we grabbed a link instead of a text url (no href)
if ($fishnof !== false) {
    $linkcheck = rev_strpos($nofresult, 'href', $fishnof);
    $endthis = false;
    while ($endthis !== true) {
        if ($linkcheck > ($fishnof - 25)) { // 19 accounts for href="https://blog. 25 just in case
            $endthis = true;
            break;
        }
        $lastfishnof = $fishnof;
        $fishnof = strpos($nofresult, $supshorturl, $fishnof + 1);
        if ($fishnof === false) { // This is the last occurrence of our URL on this page
            $fishnof = $lastfishnof;
            $linkcheck = rev_strpos($nofresult, 'href', $fishnof);
            $endthis = true;
            break;
        }
        if ($linkcheck > $fishnof) { // We went around past the end of the string (probably don't need this)
            $linkcheck = rev_strpos($nofresult, 'href', $fishnof);
            $endthis = true;
            break;
        }
        $linkcheck = rev_strpos($nofresult, 'href', $fishnof);
    }
    if ($linkcheck < ($fishnof - 25)) { // 19 accounts for href="https://blog. 25 just in case
        $return[0] = 'Non-link.';
        $return[1] = '-';
        $nofresult = NULL; // Clean up our memory
        unset($nofresult); // Clean up our memory
        return $return;
    }
}
This is the custom rev_strpos, which just does a reverse strpos():
// Does a reverse strpos()
function rev_strpos(&$haystack, $needle, $foffset = 0){
    $length = strlen($haystack);
    $offset = $length - $foffset - 1;
    $pos = strpos(strrev($haystack), strrev($needle), $offset);
    return ($pos === false) ? false : ($length - $pos - strlen($needle));
}
So if:
$nofresult = '
Some text.Some text.Some text.Some text.Some text.Some text.
Some text.Some text.Some text.Some text.Some text.Some text.
Some text.Some text.Some text.Some text.Some text.Some text.
google.com Some text.Some text.Some text.Some text.Some text.
Some text.Some text.Some text.Some text.Some text.Some text.
Some text.Some text.Some text.Some text.Some text.Some text.
<a href="http://www.google.com">Google</a> Some text.Some text.
Some text.Some text.Some text.Some text.Some text.Some text.';
and
$supshorturl = "google.com";
This should find the position of the second occurrence of google.com, where it is inside an HTML href attribute. The problem is that it does not report any error before the crash. My error settings:
ini_set("display_errors", 1);
error_reporting(E_ALL & ~E_NOTICE);
set_error_handler('handle_errors');
My handle_errors() function logs all errors to a file. However, no errors are reported before the script crashes. Also, my curl_multi processes many URLs, and sometimes it crashes on one URL and other times on another... I am ready to pull out my hair because this seems like such an easy deal... but here I am. Another point of note: if I remove the while loop there is no crash, and if the page has the URL in an href tag first, it doesn't crash either. Please help me figure this thing out. Thanks a million!
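A side note on the missing log entries (a general PHP behaviour, not something confirmed as the cause in this thread): a callback registered with set_error_handler() is never called for fatal error types such as E_ERROR or E_PARSE, so a fatal error would leave the log empty. A minimal sketch of one way to record those as well, where the helper name is_unhandled_fatal() is made up for illustration:

```php
<?php
// Hypothetical helper: true for error types that a set_error_handler()
// callback does not receive (a subset of the fatal types, for illustration).
function is_unhandled_fatal(int $type): bool
{
    $fatal = E_ERROR | E_PARSE | E_CORE_ERROR | E_COMPILE_ERROR;
    return ($type & $fatal) !== 0;
}

// Runs when the script ends, including after a fatal error.
register_shutdown_function(function () {
    $last = error_get_last(); // null if no error was recorded
    if ($last !== null && is_unhandled_fatal($last['type'])) {
        // Assumption: error_log() writes somewhere you can read;
        // swap in your own file logging if preferred.
        error_log(sprintf(
            'Fatal: %s in %s:%d',
            $last['message'],
            $last['file'],
            $last['line']
        ));
    }
});
```

If the shutdown log shows something like a memory limit error, that would at least narrow down where to look.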
Comments (2)
I think you're making it harder than it needs to be. If rev_strpos is only needed to return the last instance of your search string, and if you aren't worried about case, use strripos instead. From the PHP docs...
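To illustrate the built-in reverse search (this is a sketch, not the snippet originally quoted in this answer): strrpos() and strripos() accept a negative offset, in which case the match has to start at or before that many bytes from the end of the string. The helper name last_pos_before() and the sample string are made up for this example.

```php
<?php
// Hypothetical helper: position of the last $needle that starts at or
// before index $before, or false if there is none. Case-sensitive;
// use strripos() instead of strrpos() for a case-insensitive search.
function last_pos_before(string $haystack, string $needle, int $before)
{
    $length = strlen($haystack);
    if ($before < 0) {
        return false;
    }
    if ($before >= $length) {
        return strrpos($haystack, $needle); // search the whole string
    }
    // Negative offset: the match must start at or before $before.
    return strrpos($haystack, $needle, $before - $length);
}

// Made-up sample: a plain-text URL first, then the same URL inside a link.
$html = 'google.com <a href="http://www.google.com">Google</a>';

$first  = strpos($html, 'google.com');             // plain-text occurrence
$second = strpos($html, 'google.com', $first + 1); // occurrence inside href

var_dump(last_pos_before($html, 'href', $first));  // no href before the first one
var_dump(last_pos_before($html, 'href', $second)); // href just before the second
```

This avoids reversing the whole page with strrev() on every call.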
If you need it to be case-sensitive, or just want to use your own function for some reason, the problem is in how you are calculating the offset. Specifically in these 2 lines:
Using your sample "Some text..." and searching for "google.com", if we don't specify an offset it calculates the offset as length (500 chars) - offset (0 chars) - 1. Then you use strpos on a 500-char length string starting at offset character 499. You're never going to find anything that way.
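The arithmetic above can be checked on a small string. This is only a sketch: rev_strpos() is copied from the question, and the 12-character haystack is made up so the positions are easy to follow.

```php
<?php
// rev_strpos() as posted in the question, unchanged apart from layout.
function rev_strpos(&$haystack, $needle, $foffset = 0)
{
    $length = strlen($haystack);
    $offset = $length - $foffset - 1;
    $pos = strpos(strrev($haystack), strrev($needle), $offset);
    return ($pos === false) ? false : ($length - $pos - strlen($needle));
}

// Made-up 12-character haystack; 'href' starts at index 4.
$sample = 'abc href def';

// Default $foffset = 0: strpos() starts at index 11 of the reversed
// string, so only one character is left to search and nothing can match.
$withDefault = rev_strpos($sample, 'href');

// $foffset = strlen - 1: strpos() starts at index 0 of the reversed
// string, so the whole haystack is searched.
$fromEnd = rev_strpos($sample, 'href', strlen($sample) - 1);

var_dump($withDefault); // bool(false)
var_dump($fromEnd);     // int(4)
```

So with the default offset the function can never find anything; it only searches a useful range when a position is passed in.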
Since you are reversing your haystack and also your needle, you need to "reverse" your offset. Change the line to:
(Actually, you should change your prior line to calculate the $offset where you want it to be, but you get the point...)
UPDATE:
Further to the recommendations about using Regex, it's really trivial to get locations:
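The snippet that originally followed is not preserved here, so this is only a sketch of the idea: preg_match_all() with the PREG_OFFSET_CAPTURE flag returns each match together with its byte offset. The function name find_linked_offsets(), the pattern and the sample string are assumptions for illustration, and the pattern only handles quoted href values.

```php
<?php
// Hypothetical helper: byte offsets of every href="..." attribute whose
// value contains $url. Returns an empty array if there are none.
function find_linked_offsets(string $html, string $url): array
{
    // href, optional spaces around '=', an opening quote, then any
    // non-quote characters followed by the (escaped) URL.
    $pattern = '/href\s*=\s*["\'][^"\']*' . preg_quote($url, '/') . '/i';

    preg_match_all($pattern, $html, $matches, PREG_OFFSET_CAPTURE);

    // Each entry of $matches[0] is [matched text, offset]; keep the offsets.
    return array_column($matches[0], 1);
}

// Made-up sample: a plain-text URL first, then the same URL inside a link.
$html = 'google.com <a href="http://www.google.com">Google</a>';

$offsets = find_linked_offsets($html, 'google.com');
var_dump($offsets); // one offset, pointing at the start of href
```

An empty result then corresponds to the 'Non-link.' case from the question.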
This really should be preferable to the rev_strpos & while loop gimmicks.
Problem is this parse error
... it should be