preg_replace is replacing incorrectly

Posted on 2024-11-26 19:24:22

I've had some help from a friend (who is now away on holiday), but I have a problem with a preg_replace search and replace. I don't know why, but it is replacing strings incorrectly, which has a knock-on effect on the next one it should replace.

This basically goes within a template class dealing with 'if' and 'else' queries within the template.

function if_statement($a, $b, $if, $type, $else = NULL){
    if($type == "1" && is_numeric($a) && is_numeric($b)){
        $statement = ($a === $b) ? $if : $else;
    } else if($type == "1"){
        $statement = ($a == $b) ? $if : $else;
    } else if($type == "2"){
        $statement = ($a != $b) ? $if : $else;
    }
    return stripslashes($statement);
}

$output = file_get_contents("template.tpl");

$replace = array(
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<else\>(.*?)\<\/endif\>#sei',
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<\/endif\>#sei'
);  
$functions = array(
  "if_statement('\\1', '\\2', '\\3', '1', '\\4')",
  "if_statement('\\1', '\\2', '\\3', '1')"
);
$output = preg_replace($replace, $functions, $output);
echo $output;

The template:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />
    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
    <if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
</body>
</html>

The current output is:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    **</endif>**
</head>
<body>
    **<if:"'{TODAY}' == 'Monday'">**Today is Monday
    1 doesn't equal 2
</body>
</html>

In the above, the bolded (asterisk-marked) parts shouldn't be in the output, and today isn't Monday either. The admin is logged in, so the admin-bar.css file has rightly been included, but for some reason the </endif> tag isn't being picked up. In fact, it looks like the match has run on to the <else> tag in the next statement. In other words, preg_replace has matched the wrong thing, and so it never picks up the second <if> statement.

The {BRACKET} tags are being replaced correctly - I've even manually put data into the statement (just to check), so they aren't the problem...

I don't know why, but to me preg_replace isn't finding the correct sequence to replace and act upon. If anyone could cast a fresh pair of eyes over this or lend a hand, I would be grateful.

Thanks!

Comments (1)

遗失的美好 2024-12-03 19:24:22

The first <if> in your sample doesn't have an <else> clause. Therefore, when <if:"'(.*?)' == '(.*?)'">(.*?)<else>(.*?)</endif> (where <else> is not optional) is applied to it, it matches all this:

    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>

In that match, group $3 is

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday
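The over-match can be reproduced in isolation. The snippet below is only a sketch: the two-line template is a hypothetical reduction of the one in the question, and the /e modifier is left off because nothing is being evaluated here, only matched.

```php
<?php
// Hypothetical reduced template: the first <if> has no <else>, the second does.
$template = <<<'TPL'
<if:"'1' == '1'">A</endif>
<if:"'x' == 'Monday'">B<else>C</endif>
TPL;

// The if/else pattern from the question, without the /e modifier.
$pattern = '#<if:"\'(.*?)\' == \'(.*?)\'">(.*?)<else>(.*?)</endif>#si';

preg_match($pattern, $template, $m);

// The lazy (.*?) for group 3 keeps growing until it finds an <else>,
// so it swallows the first </endif> and the whole second <if ...> tag.
var_dump($m[0] === $template);
var_dump($m[3]);
```

Group 3 here starts at "A" and only ends just before the <else> of the second statement, which is the same shape as the match shown above.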

You could avoid that by forbidding the regex to cross over an </endif> using lookahead assertions:

'%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si'

or, in commented form (and possibly more helpful when a programmer again goes "away on holiday"):

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:         # The following group...
      (?!        # only if we\'re not right before...
       <else>    # <else>
      |          # or
       </endif>  # </endif>
      )          # (End of lookahead assertion)
      .          # Match any character
     )*          # Repeat as necessary
    )            # End of capturing group 3
    <else>       # Match <else>
    (            # Same construction as above, group 4
     (?:
      (?!
       </endif>  # this time only looking for </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

The second regex should also be improved:

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:
      (?!
       </endif>  # Any text until </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

Also, these regexes should be faster as they specify more clearly what can and cannot be matched, thus avoiding a lot of backtracking.
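One caveat for anyone running this on a current PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so there the two patterns have to be used with preg_replace_callback instead. The following is a minimal sketch under a few assumptions: the template is inlined rather than read from template.tpl, the {BRACKET} tags are taken to be already substituted, and the closure is a simplified stand-in for if_statement() that only handles the '==' case with a strict string comparison. stripslashes() is dropped because it was only undoing the escaping that /e applied to the captured groups.

```php
<?php
// Inlined stand-in for template.tpl, with the {BRACKET} tags already filled in.
$template = <<<'TPL'
<head>
    <if:"'1' == '1'">
        <link rel="stylesheet" href="admin-bar.css" />
    </endif>
</head>
<body>
    <if:"'Tuesday' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
    <if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
</body>
TPL;

// The two patterns from this answer, in single-line form and without /e.
$patterns = array(
    // if ... else ... endif
    '%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si',
    // if ... endif
    '%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!</endif>).)*)</endif>%si',
);

// Simplified stand-in for if_statement(): pick the if or else branch.
$choose = function ($m) {
    // Group 4 only exists for the if/else pattern.
    $else = isset($m[4]) ? $m[4] : '';
    return ($m[1] === $m[2]) ? $m[3] : $else;
};

$output = preg_replace_callback($patterns, $choose, $template);
echo $output;
```

The if/else pattern runs first and skips the first block, because the lookahead stops group 3 at </endif> before any <else> can be reached; the second pattern then picks that block up.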
