Replacing closing div tags with preg_replace_callback

Posted 2024-12-25 05:26:40 · 2090 characters · 1 view · 0 comments

I am trying to develop a PHP script that replaces all divs in an HTML string with paragraphs except those which have attributes (e.g. <div id="1">). The first thing my script currently does is use a simple str_replace() to replace all occurrences of <div> with <p>, and this leaves behind any div tags with attributes and end div tags (</div>). However, replacing the </div> tags with </p> tags is a bit more problematic.

So far, I have developed a preg_replace_callback function that is designed to convert some </div> tags into </p> tags to match the opening <p> tags, while leaving alone the </div> tags that close a <div> with attributes. Below is the script that I am using:

<?php
$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";
$input2 = str_replace("<div>", "<p>", $input);
$output = preg_replace_callback("/(<div )|(<\/div>)/", 'replacer', $input2);

function replacer($matches){
    static $count = 0;
    $counter=count($matches);
    for($i=0;$i<$counter;$i++){
        if($matches[$i]=="<div "){
            return "<div ";
            $count++;
        } elseif ($matches[$i]=="</div>"){
            $count--;
            if ($count>=0){
                return "</div>";
            } elseif ($count<0){
                return "</p>";
                $count++;
            }
        }
    }
}
echo $output;
?>

The script basically puts all the remaining <div> and </div> tags into an array and then loops through it. A counter variable is incremented when it encounters a <div> tag and decremented when it encounters a </div> tag within the array. When the counter is less than 0, a </p> tag is returned; otherwise a </div> is returned.

The output of the script should be:

<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>

Instead, the output I am getting is:

<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</p></p><p>I am fine.</p>

I have spent hours making as many edits to the script as I can think of, and I keep getting the same output. Can anyone explain to me where I am going wrong or offer an alternative solution?

Any help would be appreciated.
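
One possible reading of the output above, offered as a sketch rather than a definitive diagnosis: in `replacer`, each `$count++` sits after a `return` and so never runs, and `$matches[0]` always holds the full match, so the loop returns on its first iteration. The counter therefore only ever decreases, and every `</div>` becomes `</p>`. A single counter also cannot tell which kind of opening tag a given `</div>` closes. The version below instead pushes the kind of each opening tag onto a stack and pops it at each closing tag. The name `convertPlainDivs` and the regular expression are assumptions made for illustration; the sketch skips the `str_replace()` step and assumes well-formed input with no `>` inside attribute values.

```php
<?php
// Hypothetical helper (not from the original post): converts attribute-less
// <div> elements to <p> in a single pass over the string.
function convertPlainDivs(string $html): string
{
    $stack = []; // remembers, per open element, whether it became 'p' or stayed 'div'

    return preg_replace_callback(
        '/<div(\s[^>]*)?>|<\/div>/i',
        function (array $m) use (&$stack): string {
            if ($m[0][1] !== '/') {
                // Opening tag: keep it only if it carries attributes.
                $hasAttributes = isset($m[1]) && trim($m[1]) !== '';
                $stack[] = $hasAttributes ? 'div' : 'p';
                return $hasAttributes ? $m[0] : '<p>';
            }
            // Closing tag: close whatever the matching opening tag became.
            // An unbalanced </div> is left as </div>.
            $tag = array_pop($stack) ?? 'div';
            return "</{$tag}>";
        },
        $html
    );
}
```

Calling `convertPlainDivs($input)` with the `$input` string from the script above should produce the expected output quoted in the question.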

Comments (2)

谈场末日恋爱 2025-01-01 05:26:40

In addition to what mario commented, and comparable to phpquery or querypath, you can use the PHP DOMDocument class to find the <div> elements in question and replace them with <p> elements.

The cornerstones are the DOM (Document Object Model) and XPath:

$input = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div>";

$doc = new DOMDocument();
$doc->loadHTML("<div id='body'>{$input}</div>");
$root = $doc->getElementById('body');
$xp = new DOMXPath($doc);

$expression = './/div[not(@id)]';

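// Repeat the query until it finds no more matching <div> elements: replacing
// an outer <div> clones its children, so nested matches need another pass.
while (($r = $xp->query($expression, $root)) && $r->length) {
    foreach ($r as $div) {
        $new = $doc->createElement('p');
        foreach ($div->childNodes as $child) {
            $new->appendChild($child->cloneNode(true));
        }
        $div->parentNode->replaceChild($new, $div);
    }
}

// Serialize only the children of the wrapper <div>, dropping trailing whitespace.
$html = '';
foreach ($root->childNodes as $child) {
    $html .= rtrim($doc->saveHTML($child));
}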

echo $html;

This will give you:

<p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p>
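
The expression `.//div[not(@id)]` above only skips divs that carry an `id`. If the goal is to keep every div that has any attribute at all, as the question describes, one possible variant is `not(@*)`. The following is a minimal sketch under that assumption, reusing the same wrapper-div trick; the function name `divsToParagraphs` and the `wrapper` id are made up for illustration, and the input is assumed not to contain an element with that id. It moves child nodes instead of cloning them, so one pass over the query result should be enough.

```php
<?php
// Hypothetical helper (not from the original answer): replaces every <div>
// that has no attributes at all with a <p>, using DOM and XPath.
function divsToParagraphs(string $html): string
{
    $doc = new DOMDocument();
    // Wrap the fragment so there is a single known root to query and serialize.
    $doc->loadHTML("<div id=\"wrapper\">{$html}</div>");
    $root = $doc->getElementById('wrapper');
    $xpath = new DOMXPath($doc);

    // not(@*) matches only elements without any attribute; the wrapper itself
    // has an id and is not a descendant of $root, so it is never replaced.
    $plainDivs = iterator_to_array($xpath->query('.//div[not(@*)]', $root));

    foreach ($plainDivs as $div) {
        $p = $doc->createElement('p');
        // Move (not clone) the children, so nested <div> nodes that are still
        // waiting in $plainDivs stay attached to the document.
        while ($div->firstChild) {
            $p->appendChild($div->firstChild);
        }
        $div->parentNode->replaceChild($p, $div);
    }

    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= rtrim($doc->saveHTML($child));
    }
    return $out;
}
```

For example, `divsToParagraphs('<div>a</div><div class="k">b<div>c</div></div>')` keeps the `class="k"` div and turns the other two into paragraphs.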

小红帽 2025-01-01 05:26:40

I took a different approach with multiple regular expressions:

$text = "<div>Hello world!</div><div><div id=\"1\">How <div>are you</div> today?</div></div><div>I am fine.</div><div>an other <div id=\"2\">small</div>test</div><div>nested<div>divs</div>...</div>";
echo "before: " . $text . "\n";

do
{
    $count1 = 0;
    $text = preg_replace("/<div>((?![^<]*?<div).*?)<\/div>/", "<p>$1</p>", $text, -1, $count1);
    $count2 = 0;
    $text = preg_replace("/<div ([^>]+)>((?![^<]*?<div).*?)<\/div>/", "<temporarytag $1>$2</temporarytag>", $text, -1, $count2);
} while ($count1 + $count2 > 0);

$text = preg_replace("/(<[\/]?)temporarytag/", "$1div", $text);

echo "after: " . $text;

This will get you:

    before: <div>Hello world!</div><div><div id="1">How <div>are you</div> today?</div></div><div>I am fine.</div><div>an other <div id="2">small</div>test</div><div>nested<div>divs</div>...</div>
    after: <p>Hello world!</p><p><div id="1">How <p>are you</p> today?</div></p><p>I am fine.</p><p>an other <div id="2">small</div>test</p><p>nested<p>divs</p>...</p>

If you don't need the snippet, I have learned something about regexp's myself at least :P
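
The lookahead `(?![^<]*?<div)` in the patterns above appears to inspect only the first tag that follows the opening `<div>`, so a nested div that comes after some other tag may slip through. A possible tightening, sketched here as an assumption rather than as the answerer's method, checks every character of the content so that only innermost divs match on each pass. The names `convertInnermostFirst` and `placeholder` are hypothetical; the input is assumed not to contain a `placeholder` tag already, nor `>` inside attribute values.

```php
<?php
// Hypothetical helper (not from the original answer): innermost-first
// conversion, repeated until no <div> without nested divs remains.
function convertInnermostFirst(string $html): string
{
    // Content that contains no opening or closing div tag anywhere.
    $noDiv = '((?:(?!<\/?div\b).)*)';

    do {
        // Attribute-less innermost divs become paragraphs.
        $html = preg_replace("/<div>{$noDiv}<\/div>/s", '<p>$1</p>', $html, -1, $plain);
        // Innermost divs with attributes are parked under a temporary name,
        // so that their parents count as innermost on the next pass.
        $html = preg_replace(
            "/<div(\s[^>]*)>{$noDiv}<\/div>/s",
            '<placeholder$1>$2</placeholder>',
            $html,
            -1,
            $withAttrs
        );
    } while ($plain + $withAttrs > 0);

    // Restore the parked divs.
    return preg_replace('/(<\/?)placeholder\b/', '$1div', $html);
}
```

Run on the `$text` string above, this should give the same "after" result, and it should also cope with input such as `<div>a<b>x</b><div id="k">y</div></div>`.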
