PHP: Remove only the first few empty <p> tags

Posted 2024-10-06 22:32:53 · 791 characters · 0 views · 0 comments

I have a custom developed CMS where users can enter some content into a rich text field (ckeditor).

Users simply copy-paste data from another document. Sometimes the data has empty <p> tags at the beginning. Here's a sample of the data:

<p></p>
<p></p>
<p></p>
<p>Data data data data</p>
<p>Data data data data</p>
<p>Data data data data</p>
<p>Data data data data</p>
<p></p>
<p></p>
<p>Data data data data</p>
<p>Data data data data</p>
<p></p>

I don't want to remove all the empty <p> tags, only the ones before the actual data, the top 3 <p> tags in this case.

How can I do that?

Edit: To clarify, I need a PHP solution. JavaScript won't do.

Is there a way I can gather all <p> tags in an array, then iterate and delete until I encounter one with data?

Comments (4)

笨死的猪 2024-10-13 22:32:53

Please, don't use regular expressions for irregular strings: it stirs the sleeping god. Instead, use XPath:

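function strip_opening_lines($html) {
  // Nothing to parse.
  if (trim($html) === '') {
    return $html;
  }

  $dom = new DOMDocument();
  // Property names are case-sensitive: it is preserveWhiteSpace.
  $dom->preserveWhiteSpace = false;
  // Pasted fragments are rarely valid documents; keep libxml warnings quiet.
  $previous = libxml_use_internal_errors(true);
  $dom->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $xpath = new DOMXPath($dom);
  $nodes = $xpath->query("//p");

  foreach ($nodes as $node) {
    // Remove non-significant whitespace.
    $trimmed_value = trim($node->nodeValue);

    // If we found a non-empty node, we're done. Break out.
    // Compare against '' rather than using empty(), which would
    // also treat the string "0" as empty.
    if ($trimmed_value !== '') {
      break;
    }
    // The node is empty (i.e. <p></p>), so remove it from the document.
    $node->parentNode->removeChild($node);
  }
  $parsed_html = $dom->saveHTML();

  // DOMDocument::saveHTML adds a DOCTYPE, <html>, and <body>
  // tags to the parsed HTML. Since this is regular data,
  // we can use regular expressions.
  if (!preg_match('#<body>(.*?)</body>#is', $parsed_html, $matches)) {
    // Fall back to the untouched input if no <body> was produced.
    return $html;
  }

  return $matches[1];
}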

Reasons why all the regex solutions presented are bad:

  • Won't match empty paragraph elements with attributes (e.g. <p class="foo"></p>)
  • Won't match empty paragraph elements that are not literally empty (e.g. <p> </p>)
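
The two cases in the list can be reproduced with a short check. This is only a minimal sketch: it uses the simple anchored pattern `^(<p></p>\s*)+`, and the sample strings and array keys are made up for illustration.

```php
<?php
// The simple anchored pattern: literal <p></p> pairs at the start of the string.
$pattern = '!^(<p></p>\s*)+!';

// Hypothetical sample inputs, one per case.
$cases = [
    'plain'      => "<p></p>\n<p>Data</p>",
    'attribute'  => "<p class=\"foo\"></p>\n<p>Data</p>",
    'whitespace' => "<p> </p>\n<p>Data</p>",
];

$results = [];
foreach ($cases as $name => $html) {
    $results[$name] = preg_replace($pattern, '', $html);
    // Report whether the leading paragraph was actually removed.
    $status = ($results[$name] === $html) ? 'unchanged' : 'stripped';
    echo $name, ': ', $status, PHP_EOL;
}
```

Only the `plain` case is stripped; the paragraph with an attribute and the one containing a space are left in place, which is the behaviour the list describes.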

指尖上得阳光 2024-10-13 22:32:53

Normally I would advise against using a regular expression to parse HTML, but this one seems harmless:

$html = preg_replace('!^(<p></p>\s*)+!', '', $html);
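
If the pasted markup is not always a literal `<p></p>`, the same anchored idea can be loosened a little. The following is a sketch under assumptions, not a general HTML parser: the function name `strip_leading_empty_paragraphs` is hypothetical, and it assumes "empty" means a paragraph holding nothing but whitespace or `&nbsp;` entities (which some editors may insert into empty paragraphs), optionally with attributes on the tag.

```php
<?php
// Hypothetical helper name; not part of PHP or any library.
function strip_leading_empty_paragraphs(string $html): string
{
    // ^\s*            optional whitespace before the first tag
    // <p\b[^>]*>      an opening <p>, with or without attributes
    // (?:\s|&nbsp;)*  nothing, whitespace, or non-breaking-space entities
    // </p>\s*         the closing tag and any whitespace after it
    $pattern = '~^\s*(?:<p\b[^>]*>(?:\s|&nbsp;)*</p>\s*)+~i';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

$sample = "<p></p>\n<p class=\"x\"> </p>\n<p>&nbsp;</p>\n<p>Data</p>\n<p></p>";
echo strip_leading_empty_paragraphs($sample), PHP_EOL;
```

Because the pattern is anchored with `^` and no multiline modifier is used, empty paragraphs after the first one with content are left untouched. Paragraphs that contain only other elements (an empty `<span>`, a `<br>`) are not treated as empty here; handling those is where a DOM-based approach becomes the safer choice.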

暗喜 2024-10-13 22:32:53

Use a pattern anchored at the start of the string, without the U (ungreedy) or m (multiline) modifiers:

$html = preg_replace('~^(?:<p></p>\s*)+~i', '', $html);

﹏半生如梦愿梦如真 2024-10-13 22:32:53

You can do it in JavaScript: as soon as the paste is performed, strip the unwanted tags with a regular expression.

Your code would look like this:

document.getElementById("id of rich text field").onkeyup = stripData; 
document.getElementById("id of rich text field").onmouseup = stripData; 

function stripData(){
    document.getElementById("id of rich text field").value = document.getElementById("id of rich text field").value.replace(/\<p\>\<\/p\>/g,"");
}

Edit: To remove only the initial empty <p> tags:

 function stripData(){
        var field = document.getElementById("id of rich text field");
        // One anchored replace removes every leading empty <p></p>,
        // including the whitespace between them.
        field.value = field.value.replace(/^(?:<p><\/p>\s*)+/, "");
 }