PHP PREG question

Posted 2024-08-16 09:16:39

As hard as I try, PREG and I don't get along, so I am hoping one of you PHP gurus can help out.

I have some HTML source code coming in to a PHP script, and I need specific items stripped out/removed from the source code.

First, if this comes in as part of HTML (could be multiple instances):

<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>

I want it converted into simply [[[SOMETEXT]]]

Note that the prefix will always be (I think):

<SPAN class=placeholder

.. and suffix will always be

</SPAN>

(yes, capital SPAN), but the title="" and jQuery###="#" pieces may be different. [[[SOMETEXT]]] could be anything. I essentially want the SPAN tag removed.

Next, if this comes as part of HTML (also could be multiple instances):

<span style="" class="placeholder" title="">[[[SOMETEXT]]</span>

.. same thing - I just want the [[[SOMETEXT]]] part to remain. I think the <span style="" class="placeholder" piece will always be the prefix, and </span> (in this case, lowercase span tags) will be the suffix.

I understand this will probably take 2 PREG commands, but I would like to be able to pass the HTML text into a function and get back a cleaned/stripped version, something like this:

$dirty_text = $_POST['html_text'];
$clean_text = strip_placeholder_spans($dirty_text);
function strip_placeholder_spans( $in_text ) {
 // all the preg magic happens here, and returns result
}

ADDED/UPDATED FOR CLARITY

OK, I am getting some good feedback and getting close. However, to make it clearer, here is an example. I want to send this text into the function strip_placeholder_spans():

<blockquote>
<h2 align="center">Firefox: <span class="placeholder" title="">[[[ITEM1]]]</span></h2>
<h2 align="center">IE1:<SPAN class=placeholder title="" jQuery1262031390171="46">[[[ITEM2]]]</SPAN>
</h2>
<h2 align="center">IE2:<SPAN class=placeholder title="" jQuery1262031390412="52">[[[ITEM3]]]</SPAN> 
</h2>
<h2 align="center"><br><font face="Arial, Helvetica, sans-serif">COMPLETE</font></h2>
<p align="center">Your Text Can Go Here</p>
<p align="center"><a href="javascript:self.close()">Close this Window</a></p>
<p align="center"><br></p>
<p align="center"><a href="javascript:self.close()"><br></a></p></blockquote>
<p align="center"></p>

and when it comes back, it should be this:

<blockquote>
<h2 align="center">Firefox: [[[ITEM1]]]</h2>
<h2 align="center">IE1:[[[ITEM2]]]</h2>
<h2 align="center">IE2:[[[ITEM3]]]</h2>
<h2 align="center"><br><font face="Arial, Helvetica, sans-serif">COMPLETE</font></h2>
<p align="center">Your Text Can Go Here</p>
<p align="center"><a href="javascript:self.close()">Close this Window</a></p>
<p align="center"><br></p>
<p align="center"><a href="javascript:self.close()"><br></a></p></blockquote>
<p align="center"></p>

Comments (3)

青春如此纠结 2024-08-23 09:16:39

Use an HTML parser; that is the most robust solution. That said, the following regular expression will match the two code examples you posted:

$s= <<<STR
<span style="" class="placeholder" title="">[[[SOMETEXT]]</span>
Some Other text & <b>Html</b>
<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>
STR;

preg_match_all('/\<span[^>]+?class="*placeholder"*[^>]+?>([^<]+)?<\/span>/isU', $s, $m);
var_dump($m);

Using regular expressions results in narrowly targeted code. This example will only handle very specific, well-formed HTML. For instance, it won't match <span class="placeholder">some text < more text</span>. If you have control over the source HTML, this may be good enough.
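
The pattern above only collects matches. A minimal sketch of the strip_placeholder_spans() function the question asks for, using preg_replace, might look like the following. The pattern here is an illustrative rewrite, not the answerer's. It assumes placeholder spans are never nested and that "placeholder" is the only class on the span. It also assumes that whitespace between the closing span tag and a following closing tag should be dropped, as in the expected output in the question.

```php
<?php
// Sketch only: remove <span ... class=placeholder ...> wrappers and keep their contents.
// Assumptions: spans are not nested; "placeholder" is the sole class value.
function strip_placeholder_spans(string $in_text): string
{
    $pattern = '~<span\b[^>]*\bclass\s*=\s*(["\']?)placeholder\1(?=[\s>])[^>]*>' // opening tag, quoted or unquoted class
             . '(.*?)</span>'                                                   // contents up to the first closing tag
             . '(?:\s+(?=</))?~is';                                             // whitespace before another closing tag

    // preg_replace() returns null on failure; fall back to the untouched input.
    return preg_replace($pattern, '$2', $in_text) ?? $in_text;
}

echo strip_placeholder_spans('Firefox: <span class="placeholder" title="">[[[ITEM1]]]</span>'), "\n";
// prints: Firefox: [[[ITEM1]]]
```

The i modifier covers both the uppercase SPAN that IE produces and the lowercase span from Firefox, so a single call handles both cases.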

塔塔猫 2024-08-23 09:16:39

I think this should solve your problem:

function strip_placeholder_spans( $in_text ) {
    // Capture the text between a ">" and the next "</"
    preg_match("/>(.*?)<\//", $in_text, $result);
    return $result[1];
}

冧九 2024-08-23 09:16:39

Step one: Remove regular expressions from your toolbox when dealing with HTML. You need a parser.

Step two: Download simple_html_dom for PHP.

Step three: Parse

$html = str_get_html('<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>');
// find() indexes are zero-based, and the property name is lowercase
$spanText = $html->find('span', 0)->innertext;

Step four: Profit!

Edit

$html->find('span.placeholder', 0)->innertext will return what you want. It looks for class=placeholder.
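
If adding a third-party library is not an option, the same parser-based idea can be sketched with PHP's bundled DOM extension. This is an illustration rather than part of the answer above: the function name strip_placeholder_spans_dom and the wrapper div are assumptions made for the example. The parser re-serializes the markup, so the output may differ cosmetically from the input (for example, lowercased tag names and quoted attributes). Non-ASCII input would need extra encoding handling that is not shown here.

```php
<?php
// Sketch only: unwrap placeholder spans with the bundled DOM extension.
// The function name and the wrapper <div> are illustrative assumptions.
function strip_placeholder_spans_dom(string $in_text): string
{
    $doc = new DOMDocument();

    // Incoming HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> gives the fragment a single root to read back from.
    $doc->loadHTML('<div>' . $in_text . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select spans whose class list contains the token "placeholder".
    $query = '//span[contains(concat(" ", normalize-space(@class), " "), " placeholder ")]';

    foreach (iterator_to_array($xpath->query($query)) as $span) {
        // Move each child in front of the span, then drop the now-empty span.
        while ($span->firstChild !== null) {
            $span->parentNode->insertBefore($span->firstChild, $span);
        }
        $span->parentNode->removeChild($span);
    }

    // Serialize the children of the wrapper <div>, skipping the implied <html>/<body>.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Unlike the regular-expression approach, this version also copes with nested spans and with extra class names on the span, at the cost of rewriting the surrounding markup.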
