How do I strip a tag and all of its inner HTML using the tag's id?

Posted 2024-09-11 03:02:51 · 400 characters · 8 views · 0 comments

I have the following html:

<html>
 <body>
 bla bla bla bla
  <div id="myDiv"> 
         more text
      <div id="anotherDiv">
           And even more text
      </div>
  </div>

  bla bla bla
 </body>
</html>

I want to remove everything from the opening <div id="anotherDiv"> through its closing </div>. How do I do that?

Comments (7)

油饼 2024-09-18 03:02:51

With native DOM

$dom = new DOMDocument;
$dom->loadHTML($htmlString);
$xPath = new DOMXPath($dom);
$nodes = $xPath->query('//*[@id="anotherDiv"]');
if($nodes->item(0)) {
    $nodes->item(0)->parentNode->removeChild($nodes->item(0));
}
echo $dom->saveHTML();
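
The same idea can be wrapped in a small reusable function. What follows is only a minimal sketch, not part of the original answer: the name `remove_element_by_id` is hypothetical, and it assumes the id contains no double quote, since the id is interpolated directly into the XPath expression. `libxml_use_internal_errors()` is used here only to keep parser warnings about imperfect markup quiet.

```php
<?php
// Hypothetical helper (not from the original answer): removes the first
// element with the given id, together with everything inside it.
function remove_element_by_id(string $html, string $id): string
{
    $previous = libxml_use_internal_errors(true); // collect parse warnings silently
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumption: $id contains no double quote character.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes !== false && $nodes->length > 0) {
        $node = $nodes->item(0);
        $node->parentNode->removeChild($node);
    }
    return $dom->saveHTML();
}

$htmlString = '<html><body>bla <div id="myDiv">more text '
    . '<div id="anotherDiv">And even more text</div></div> bla</body></html>';
$result = remove_element_by_id($htmlString, 'anotherDiv');
echo $result;
```

Because the XPath uses `//*` rather than `//div`, the helper does not care which tag carries the id.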
北笙凉宸 2024-09-18 03:02:51

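You can use preg_replace(), for example as below. Note that this pattern removes only the opening tag; the element's contents and the closing </div> are left in place.

$string = preg_replace('/<div id="someid"[^>]*>/i', "", $string);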
任谁 2024-09-18 03:02:51

Using the native XML Manipulation Library

Assuming that your html content is stored in the variable $html:

$html='<html>
 <body>
 bla bla bla bla
  <div id="myDiv"> 
         more text
      <div id="anotherDiv">
           And even more text
      </div>
  </div>

  bla bla bla
 </body>
</html>';

To delete the tag by ID use the following code:

    $dom = new DOMDocument;
    $dom->validateOnParse = false;
    $dom->loadHTML( $html );

    // get the tag
    $div = $dom->getElementById('anotherDiv');

    // delete the tag
    if( $div && $div->nodeType==XML_ELEMENT_NODE ){
        $div->parentNode->removeChild( $div );
    }

    echo $dom->saveHTML();

Note that certain versions of libxml require a doctype to be present in order to use the getElementById method.

In that case you can prepend a doctype to $html:

$html = '<!DOCTYPE html>' . $html;

Alternatively, as suggested by Gordon's answer, you can use DOMXPath to find the element using the xpath:

$dom = new DOMDocument;
$dom->validateOnParse = false;
$dom->loadHTML( $html );

$xp = new DOMXPath( $dom );
$col = $xp->query( '//div[ @id="anotherDiv" ]' );

// query() returns false for an invalid expression, otherwise a (possibly empty) node list
if( $col !== false ){
    foreach( $col as $node ){
        $node->parentNode->removeChild( $node );
    }
}

echo $dom->saveHTML();

The first method works regardless of the tag name. If you want to use the second method with the same id but a different tag, say form, simply replace //div in //div[ @id="anotherDiv" ] with //form.

晚风撩人 2024-09-18 03:02:51

The strip_tags() function may be what you are looking for. Be aware that it removes the tags themselves but keeps the text inside them.

http://us.php.net/manual/en/function.strip-tags.php
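
As a quick check of how `strip_tags()` behaves on markup like the one in the question, here is a small sketch (the shortened `$html` string is made up for illustration). It shows that the tags disappear while the text inside them stays, so on its own it does not remove the contents of `anotherDiv`.

```php
<?php
// Shortened, made-up version of the HTML from the question.
$html = '<div id="myDiv">more text <div id="anotherDiv">And even more text</div></div>';

// strip_tags() drops the markup but keeps the text between the tags.
$stripped = strip_tags($html);
echo $stripped; // more text And even more text
```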

べ繥欢鉨o。 2024-09-18 03:02:51

I wrote these to strip specific tags and attributes. Since they are regex-based, they are not guaranteed to work in all cases, but it was a fair tradeoff for me:

// Strips only the given tags in the given HTML string.
function strip_tags_blacklist($html, $tags) {
    foreach ($tags as $tag) {
        $regex = '#<\s*' . $tag . '[^>]*>.*?<\s*/\s*'. $tag . '>#msi';
        $html = preg_replace($regex, '', $html);
    }
    return $html;
}

// Strips the given attributes found in the given HTML string.
function strip_attributes($html, $atts) {
    foreach ($atts as $att) {
        $regex = '#\b' . $att . '\b(\s*=\s*[\'"][^\'"]*[\'"])?(?=[^<]*>)#msi';
        $html = preg_replace($regex, '', $html);
    }
    return $html;
}
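
One case where the tag-stripping pattern falls short is nested elements of the same name, as in the question's HTML. The sketch below applies the same pattern by hand to a made-up input; the lazy `.*?` stops at the first closing tag, so part of the outer element is left behind. The `preg_quote()` call is an addition for safety and is not in the original function.

```php
<?php
// Same pattern as in strip_tags_blacklist(), built for the tag "div".
$tag = preg_quote('div', '#');
$regex = '#<\s*' . $tag . '[^>]*>.*?<\s*/\s*' . $tag . '>#msi';

// Made-up input with one <div> nested inside another.
$html = '<div id="a">x<div id="b">y</div>z</div>tail';

// The match runs from the outer opening tag to the FIRST closing tag only.
$result = preg_replace($regex, '', $html);
echo $result; // z</div>tail
```

For nested markup like this, a DOM-based approach is the more dependable choice.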
慈悲佛祖 2024-09-18 03:02:51

How about this?

// Strips only the given tag (a single tag name, e.g. 'div') and its contents from the given HTML string.
function strip_tags_blacklist($html, $tags) {
    $html = preg_replace('/<'. $tags .'\b[^>]*>(.*?)<\/'. $tags .'>/is', "", $html);
    return $html;
}
泅渡 2024-09-18 03:02:51

Following RafaSashi's answer using preg_replace(), here's a version that works for a single tag or an array of tags:

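/**
 * Removes the opening and closing tags for the given tag name(s).
 * The content between the tags is kept.
 *
 * @param $str string
 * @param $tags string | array
 * @return string
 */
function strip_specific_tags ($str, $tags) {
  if (!is_array($tags)) { $tags = array($tags); }

  foreach ($tags as $tag) {
    $_str = preg_replace('/<\/' . $tag . '>/i', '', $str);
    if ($_str != $str) {
      // \b keeps a short name such as "b" from also matching <body> or <br>
      $str = preg_replace('/<' . $tag . '\b[^>]*>/i', '', $_str);
    }
  }
  return $str;
}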
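
To make clear what this kind of replacement does and does not remove, here is a short sketch that applies the same kind of patterns by hand for the tag `div` (the input string is made up). Only the tags are removed; the text between them is kept, so this unwraps an element rather than deleting it.

```php
<?php
// Made-up input: one <div> with an id, surrounded by text.
$str = 'a<div id="x">b</div>c';

$str = preg_replace('/<\/div>/i', '', $str);      // drop the closing tags
$str = preg_replace('/<div\b[^>]*>/i', '', $str); // drop the opening tags
echo $str; // abc
```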