preg_replace throws a segmentation fault

Posted on 2024-10-02 03:11:24

When I run the following code, I get a segmentation fault every time. Is this a known bug? How can I make this code work?

<?php
$doc = file_get_contents("http://prairieprogressive.com/");
$replace = array(
    "/<script([\s\S])*?<\/ ?script>/",
    "/<style([\s\S])*?<\/ ?style>/",
    "/<!--([\s\S])*?-->/",
    "/\r\n/"
);
$doc = preg_replace($replace,"",$doc);
echo $doc;
?>

The error (obviously) looks like:

[root@localhost 2.0]# php test.php
Segmentation fault (core dumped)

Comments (5)

瀟灑尐姊 2024-10-09 03:11:24

You have unnecessary capture groups that strain PCRE's backtracking. Try this:

$replace = array(
    "/<script.*?<\/\s?script>/s",
    "/<style.*?<\/\s?style>/s",
    "/<!--.*?-->/s",
    "/\r\n/s"
);

Also, [\s\S] (whitespace or non-whitespace) matches any character at all, so you can simply use . together with the s modifier instead.
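
A minimal sketch of this idea, run against a small inline HTML string rather than the live page. The sample markup and the function name strip_noise are assumptions made for illustration. The patterns use no capture groups, match the closing tag directly after the lazy .*? so that blocks with a body are matched too, and rely on the s modifier so that . also matches newlines.

```php
<?php
// Hypothetical helper: strips script blocks, style blocks, HTML comments
// and CRLF line endings without using any capture groups.
function strip_noise(string $html): ?string
{
    $patterns = array(
        '/<script.*?<\/\s?script>/s', // s: "." also matches newlines
        '/<style.*?<\/\s?style>/s',
        '/<!--.*?-->/s',
        '/\r\n/',
    );
    // preg_replace() returns null if PCRE reports an error.
    return preg_replace($patterns, '', $html);
}

// Sample input, made up for this example.
$sample = "<p>a</p><script>\nvar x = 1;\n</script><!-- note --><style>p{}</style><p>b</p>\r\n";
echo strip_noise($sample), "\n";
```

Because the function can return null, a caller that needs a string should check the result before echoing it.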

太阳公公是暖光 2024-10-09 03:11:24

OK! It seems like there is some issue with the () operators...

When I use

$doc = preg_replace("/<style([\s\S]*)<\/ ?style>/",'',$doc);

instead of

$doc = preg_replace("/<style([\s\S])*<\/ ?style>/",'',$doc);

it works!!
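
The difference between the two patterns can be seen on a tiny input. This is only an illustrative sketch with a made-up sample string: with ([\s\S])* the group itself is repeated once per character and only the last character is kept as the capture, whereas ([\s\S]*) runs the repetition inside a single group. In older PCRE builds that matched by recursion, each repetition of a group could consume stack space, which is a plausible explanation for the crash on a large page, though nothing in this thread confirms the exact cause.

```php
<?php
// Made-up sample input for illustration.
$css = '<style>a{}</style>';

// Group repeated per character: capture 1 holds only the last iteration.
preg_match('/<style([\s\S])*<\/ ?style>/', $css, $perChar);

// Repetition inside one group: capture 1 holds the whole run.
preg_match('/<style([\s\S]*)<\/ ?style>/', $css, $whole);

var_dump($perChar[1]); // last character consumed by the repeated group
var_dump($whole[1]);   // everything between "<style" and "</style>"
```

If the captured text is not needed at all, dropping the parentheses avoids the question entirely.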

魔法唧唧 2024-10-09 03:11:24

This seems to be a bug.

As you mentioned in the comment, the style regex is what causes this. As a workaround, you can use the s modifier so that . also matches newlines:

$doc = preg_replace("/<style.*?<\/ ?style>/s",'',$doc);
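
A short sketch of what the s modifier changes, using a made-up multi-line style block. Without s, . stops at a newline, so a pattern built on .*? cannot span a block that covers several lines; with s it can.

```php
<?php
// Made-up sample: a style block that spans several lines.
$html = "<style>\nbody { margin: 0; }\n</style><p>kept</p>";

// Without the s modifier "." does not match "\n", so nothing is removed.
$without = preg_replace('/<style.*?<\/ ?style>/', '', $html);

// With the s modifier "." also matches "\n", so the block is removed.
$with = preg_replace('/<style.*?<\/ ?style>/s', '', $html);

var_dump($without === $html);
var_dump($with);
```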

别再吹冷风 2024-10-09 03:11:24

Try this (added the u modifier for Unicode, changed ([\s\S])*? to .*?, and added the s modifier so that . also matches newlines):

<?php
$doc = file_get_contents("http://prairieprogressive.com/");
$replace = array(
    "#<script.*?</ ?script>#su",
    '#<style.*?</ ?style>#su',
    "#<!--.*?-->#su",
    "#\r\n#u"
);
);
$doc = preg_replace($replace,"",$doc);
echo $doc;
?>
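
One caveat worth sketching, based on documented preg_replace behaviour rather than anything stated in this answer: with the u modifier the subject must be valid UTF-8, otherwise preg_replace returns null and preg_last_error() reports PREG_BAD_UTF8_ERROR. The helper name replace_or_fail and the sample strings are assumptions for illustration.

```php
<?php
// Hypothetical wrapper: turns a silent null result into an exception.
function replace_or_fail(array $patterns, string $subject): string
{
    $result = preg_replace($patterns, '', $subject);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants.
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $result;
}

// s: "." also matches newlines; u: pattern and subject are treated as UTF-8.
$patterns = array('#<!--.*?-->#su');

echo replace_or_fail($patterns, 'a<!-- x -->b'), "\n";

try {
    // "\xFF" is never valid in UTF-8, so the u modifier rejects the subject.
    replace_or_fail($patterns, "a\xFFb");
} catch (RuntimeException $e) {
    echo 'failed: ', $e->getMessage(), "\n";
}
```

Echoing the result of preg_replace directly would print nothing in the failing case, which is easy to mistake for an empty page.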

临走之时 2024-10-09 03:11:24

What is the point of [\s\S]? It matches any whitespace character, and any non-whitespace character. If you replace it with .*, it works just fine.

EDIT: If you want . to match newlines too, use the s modifier. In my opinion, it is easier to understand than a contradictory [\s\S].
