Using a regular expression in preg_replace to match an HTML href anchor tag

Posted 2024-11-16 16:02:53

I'm trying to use preg_replace to replace

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

with

<a href="WWW.ANYURL.COM">DISPLAY_TEXT</a>

Here is my code:

$string = htmlentities(mysql_real_escape_string($string1));
$newString = preg_replace('#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#','<a href="$1">$2</a>',$string);

If I do limited tests such as:

$newString = preg_replace('#&lt;a\ href#','TEST',$string);

then

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAYTEXT&lt;/a&gt;

becomes

TEST=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAYTEXT&lt;/a&gt;

But if I try to get it to also match the "=", it acts as if it couldn't find a match, i.e.

$newString = preg_replace('#&lt;a\ href=#','TEST',$string);

returns the original unchanged:

&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;

I've been going at this for a couple hours, any help would be greatly appreciated.

EDIT: code in context

$title = clean_input($_POST['title']);
$story = clean_input($_POST['story']);

function clean_input($string)
{
    if (get_magic_quotes_gpc()) {
        $string = stripslashes($string);
    }
    $string = htmlentities(mysql_real_escape_string($string));
    $findValues = array("&lt;b&gt;", "&lt;/b&gt;");
    $newValues = array("<b>", "</b>");
    $newString = str_replace($findValues, $newValues, $string);
    $newString2 = preg_replace('#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#', '<a href="$1">$2</a>', $newString);
    return $newString2;
}

Sample $story = Lorem ipsum dolor sit amet, consectetur adipiscing elit. <a href="www.google.com">Google</a> Vivamus quis sem felis. Morbi vitae neque ac neque blandit malesuada lobortis sit amet justo. Donec convallis, nibh ut lacinia tempor, neque felis scelerisque nibh, at feugiat lectus erat in nulla. In et euismod nunc. <pernicious code></code>Pellentesque vitae ante orci, vitae ultrices neque. <a href="www.yahoo.com">Yahoo</a> In non nulla sapien, vestibulum faucibus metus. Fusce egestas viverra arcu, <b>ac</b> sagittis leo facilisis in. Nulla facilisi.

I want only a few tags like href and bold to be allowed through as code.

Comments (2)

∞梦里开花 2024-11-23 16:02:53

You don't need to manually replace anything. If this is your whole input string, then use html_entity_decode() to turn the escapes back into < and >.
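As a minimal sketch of that suggestion, assuming the whole input is the single escaped anchor from the question: html_entity_decode() turns the entities back into markup in one call. Note that it decodes every tag in the string, so on its own it does not restrict the input to a few allowed tags.

```php
<?php
// The escaped anchor from the question, as htmlentities() would produce it.
$escaped = '&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;';

// Decode the entities back into the characters <, > and ".
$decoded = html_entity_decode($escaped);

echo $decoded, PHP_EOL;
```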


Again, your regex works as intended with the sample text.

Your problem is the premature mysql_real_escape_string() call. It adds backslashes to the " double quotes in your html, and that's why back-converting fails (your regex is not prepared for finding \").
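The effect can be reproduced without a database. The sketch below uses addslashes() as a stand-in for mysql_real_escape_string() (an assumption made for illustration: the real function needs a live connection, but both put a backslash in front of each double quote). The pattern is the one from the question.

```php
<?php
// addslashes() is a stand-in (assumption) for mysql_real_escape_string();
// both insert a backslash before every double quote.
$input   = '<a href="WWW.ANYURL.COM">DISPLAY_TEXT</a>';
$pattern = '#&lt;a\ href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#';
$replace = '<a href="$1">$2</a>';

// Order used in the question: database escaping first, then entities.
$escapedFirst = htmlentities(addslashes($input));

// Entities only, with no premature database escaping.
$entitiesOnly = htmlentities($input);

// The backslash now sits between "=" and "&quot;", so the pattern cannot match.
$failed = preg_replace($pattern, $replace, $escapedFirst);

// Without the backslash the same pattern matches and restores the anchor.
$worked = preg_replace($pattern, $replace, $entitiesOnly);

echo $escapedFirst, PHP_EOL;
echo $failed, PHP_EOL;
echo $worked, PHP_EOL;
```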

Avoid that. Get rid of the ugly clean_input() hack and magic_quotes as advised by the manual. You must do the database escaping right before inserting into the database, not earlier. (Or better yet use the easier PDO with prepared statements.)
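A rough sketch of the prepared-statement route, under stated assumptions: the in-memory SQLite database, the `stories` table and its columns are made up for illustration, and the pdo_sqlite extension is assumed to be available. The point is that the value is bound as a parameter, so no manual escaping touches the string before it is stored.

```php
<?php
// Hypothetical in-memory database and table, for illustration only.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE stories (title TEXT, story TEXT)');

// Raw user input, stored untouched; quotes need no manual escaping.
$title = 'It\'s a "test"';
$story = '<a href="www.google.com">Google</a>';

// The driver handles quoting when the values are bound to the placeholders.
$stmt = $pdo->prepare('INSERT INTO stories (title, story) VALUES (:title, :story)');
$stmt->execute([':title' => $title, ':story' => $story]);

// Read the row back to confirm the values were stored unchanged.
$row = $pdo->query('SELECT title, story FROM stories')->fetch(PDO::FETCH_ASSOC);
```

HTML escaping then happens separately, at output time, rather than being mixed with the database step.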

Also avoid the $newString123 variable duplicates, just overwrite the one you already have when rewriting strings.
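Putting these points together, one possible shape for the cleanup is sketched below. The function name allow_basic_tags() is hypothetical, and the sample string is a shortened version of the one in the question. It escapes everything first, re-enables only `<b>` and simple `<a href="...">` tags, and keeps overwriting the same variable.

```php
<?php
// Hypothetical helper: escape all markup, then re-enable a small whitelist.
function allow_basic_tags(string $string): string
{
    // Escape every tag and quote; no database escaping happens here.
    $string = htmlentities($string, ENT_QUOTES, 'UTF-8');

    // Re-enable bold tags.
    $string = str_replace(['&lt;b&gt;', '&lt;/b&gt;'], ['<b>', '</b>'], $string);

    // Re-enable anchors of the exact form <a href="...">text</a>.
    $string = preg_replace(
        '#&lt;a href=&quot;([^&]*)&quot;&gt;([^&]*)&lt;/a&gt;#',
        '<a href="$1">$2</a>',
        $string
    );

    return $string;
}

$story = 'Lorem <a href="www.google.com">Google</a> <pernicious code> <b>ac</b> sagittis.';
$clean = allow_basic_tags($story);

echo $clean, PHP_EOL;
```

This sketch does not validate the URL itself, so a value such as `javascript:...` in the href would still pass through; a real whitelist would need to check the scheme as well.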

因为看清所以看轻 2024-11-23 16:02:53

You could also do it like this:

$str = "&lt;a href=&quot;WWW.ANYURL.COM&quot;&gt;DISPLAY_TEXT&lt;/a&gt;";
echo "Your html code is thus: " . htmlspecialchars_decode($str);