preg_replace not replacing correctly

Posted 2024-11-26 19:24:22

I've had some help from a friend (who is now away on holiday), but I have a problem with a preg_replace search and replace. I don't know why, but it is replacing strings incorrectly, which has a knock-on effect on the next one it should replace.

This basically goes within a template class dealing with 'if' and 'else' queries within the template.

function if_statement($a, $b, $if, $type, $else = NULL){
    if($type == "1" && is_numeric($a) && is_numeric($b)){
        $statement = ($a === $b) ? $if : $else;
    } else if($type == "1"){
        $statement = ($a == $b) ? $if : $else;
    } else if($type == "2"){
        $statement = ($a != $b) ? $if : $else;
    }
    return stripslashes($statement);
}

$output = file_get_contents("template.tpl");

$replace = array(
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<else\>(.*?)\<\/endif\>#sei',
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<\/endif\>#sei'
);  
$functions = array(
  "if_statement('\\1', '\\2', '\\3', '1', '\\4')",
  "if_statement('\\1', '\\2', '\\3', '1')"
);
$output = preg_replace($replace, $functions, $output);
echo $output;

The template:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />
    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
    <if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
</body>
</html>

The current output is below:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    **</endif>**
</head>
<body>
    **<if:"'{TODAY}' == 'Monday'">**Today is Monday
    1 doesn't equal 2
</body>
</html>

In the above, the bolded/asterisk-marked parts shouldn't be in the output, and today isn't Monday either. While the admin is logged in, the admin-bar.css file has rightly been included, but for some reason the first </endif> tag isn't being picked up. In fact, it looks like the match has run on to the <else> tag in the next statement. In other words, preg_replace has matched the wrong thing, and so it never picked up the second <if> statement.

The {BRACKET} tags are being replaced correctly - I've even manually put data into the statement (just to check), so they aren't the problem...

I don't know why, but to me preg_replace isn't finding the correct sequence to replace and act upon. If anyone could cast a fresh pair of eyes over this or lend a hand, I would be grateful.

Thanks!

遗失的美好 2024-12-03 19:24:22

The first <if> in your sample doesn't have an <else> clause. Therefore, when <if:"'(.*?)' == '(.*?)'">(.*?)<else>(.*?)</endif> (where <else> is not optional) is applied to it, it matches all this:

    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>

In that match, group $3 is

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday
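The overmatch described above can be reproduced in isolation. This is a minimal sketch, assuming a trimmed copy of the template from the question and the original if/else pattern with the /e modifier removed, since it only matches and does not replace:

```php
<?php
// Trimmed copy of the template from the question (illustrative only).
$template = <<<'TPL'
<if:"'{ISADMIN}' == '1'">
    <link rel="stylesheet" href="admin-bar.css" />
</endif>
<if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
TPL;

// The original if/else pattern, without /e because nothing is evaluated here.
$pattern = '#<if:"\'(.*?)\' == \'(.*?)\'">(.*?)<else>(.*?)</endif>#si';

preg_match($pattern, $template, $m);

echo $m[1], "\n";                                // {ISADMIN}
// Group 3 runs past the first </endif> and swallows the second <if:...> tag.
var_dump(strpos($m[3], '</endif>') !== false);   // bool(true)
var_dump(strpos($m[3], '<if:') !== false);       // bool(true)
echo $m[4], "\n";                                // Today is not Monday
```

The lazy `(.*?)` only promises the shortest match that still lets the rest of the pattern succeed, so with no `<else>` in the first block it keeps expanding until it reaches the `<else>` of the second one.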

You could avoid that by forbidding the regex to cross over an </endif> using lookahead assertions:

'%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si'

or, in commented form (and possibly more helpful when a programmer again goes "away on holiday"):

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:         # The following group...
      (?!        # only if we\'re not right before...
       <else>    # <else>
      |          # or
       </endif>  # </endif>
      )          # (End of lookahead assertion)
      .          # Match any character
     )*          # Repeat as necessary
    )            # End of capturing group 3
    <else>       # Match <else>
    (            # Same construction as above, group 4
     (?:
      (?!
       </endif>  # this time only looking for </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

The second regex should also be improved:

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:
      (?!
       </endif>  # Any text until </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

Also, these regexes should be faster as they specify more clearly what can and cannot be matched, thus avoiding a lot of backtracking.
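One caveat for anyone on a current PHP version: the /e modifier used in these patterns was deprecated in PHP 5.5 and removed in PHP 7.0, so the same idea has to go through preg_replace_callback. The following is a rough sketch under that assumption (PHP 7.0 or later). The wrapper name `render_conditionals` is hypothetical, the `$type` parameter from the question is left out because only the `==` case is handled, and `stripslashes()` is dropped because it only compensated for the escaping that /e applied to backreferences.

```php
<?php
// Comparison helper modelled on if_statement() from the question
// (only the "==" case; no stripslashes(), since a callback gets raw captures).
function if_statement(string $a, string $b, string $if, string $else = ''): string
{
    if (is_numeric($a) && is_numeric($b)) {
        return ($a === $b) ? $if : $else;
    }
    return ($a == $b) ? $if : $else;
}

// Hypothetical wrapper name; not part of the original template class.
function render_conditionals(string $tpl): string
{
    // Shared opening tag: <if:"'A' == 'B'">
    $head = '<if:\s*"\'([^\']*)\'\s==\s\'([^\']*)\'">';
    $patterns = [
        // if / else / endif: group 3 may not cross <else> or </endif>.
        '%' . $head . '((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si',
        // if / endif without an else branch.
        '%' . $head . '((?:(?!</endif>).)*)</endif>%si',
    ];

    $out = preg_replace_callback(
        $patterns,
        function (array $m): string {
            // $m[4] only exists when the if/else pattern matched.
            return if_statement($m[1], $m[2], $m[3], $m[4] ?? '');
        },
        $tpl
    );

    // preg_replace_callback() returns null on a PCRE error; keep the input then.
    return $out ?? $tpl;
}

$tpl = <<<'TPL'
<if:"'1' == '1'"><link rel="stylesheet" href="admin-bar.css" /></endif>
<if:"'Tuesday' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
<if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
TPL;

echo render_conditionals($tpl), "\n";
```

With the sample values filled in by hand, this prints the admin-bar.css link tag, then "Today is not Monday", then "1 doesn't equal 2", with no leftover `<if:` or `</endif>` tags. The patterns are applied in array order, so the if/else form runs first and the else-less form picks up the remaining blocks.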
