Quote-escaping error in a PHP regular expression

Posted on 2024-12-25 09:40:44

I am new to PHP and trying to replace a URL pattern with google.com in the code below.

    $textStr = "Test string contains http://foo.com/more_(than)_one_(parens)
    http://foo.com/blah_(wikipedia)#cite-1
    http://foo.com/blah_(wikipedia)_blah#cite-1
    http://foo.com/unicode_(?)_in_parens
    http://foo.com/(something)?after=parens
    more urls foo.ca/me some other text";

    $pattern = '(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)((?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))*)';

    $textStr = preg_replace($pattern, "google.com", $textStr);

    echo $textStr;

I found the regular expression at http://daringfireball.net/2010/07/improved_regex_for_matching_urls, but I have not managed to escape the single and double quotes in the pattern correctly.

Currently I get the message: Warning: preg_replace() Unknown modifier '\'
But I did use a backslash (\) to escape the single quote in {};:\'"

Can someone please help me with the code above?

Comments (2)

不必了 2025-01-01 09:40:44

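First, preg_replace expects the regular expression to be wrapped in delimiters, conventionally /. Your pattern has none, so PHP takes the leading ( as the opening delimiter and pairs it with the first ), which leaves ?i as the "regex" and tries to read the following \b... as modifiers. That is where "Unknown modifier '\'" comes from; your quote escaping is not the problem. Delimit the pattern like this:

    /\b((?:https: ... etc etc)/

Second, since you delimit the regular expression with /, you have to escape every literal / inside it with a backslash. So https?:// -> https?:\/\/.

Third, the (?i) flag is more commonly written as the i modifier after the closing delimiter:

    /\b((?:https: .. etc etc)/i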
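The three points above can be seen on a much smaller pattern. The sketch below is only an illustration: the sample text and the short pattern are made up for this example, and ~ is used as one possible alternative delimiter, which avoids having to escape / at all.

```php
<?php
// Hypothetical sample input, not taken from the question.
$text = 'Visit HTTP://foo.com/page today';

// No explicit delimiters: PHP pairs the leading "(" with the first ")",
// so the regex is just "?i" and "\bhttps?://" is parsed as modifiers.
// The @ operator only silences the warning so the script keeps running.
$broken = @preg_match('(?i)\bhttps?://', $text);
var_dump($broken === false);

// "/" as delimiter: every literal "/" in the pattern must be escaped.
$withSlash = preg_match('/\bhttps?:\/\//i', $text);

// "~" as delimiter: "/" is no longer special and needs no escaping.
$withTilde = preg_match('~\bhttps?://~i', $text);

var_dump($withSlash, $withTilde);
```

Whether / or another delimiter reads better is a matter of taste; for URL patterns a delimiter that does not occur in the pattern keeps the expression closer to the original.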

Try this (changes made: escaped each /, and rewrote (?i)regex as /regex/i):

    $pattern = '/\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)((?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))*)/i';
    $textStr = preg_replace($pattern, "google.com", $textStr);

    echo $textStr;
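As for the quotes themselves: inside a single-quoted PHP string, \' is turned into a plain ' and a double quote needs no escaping, so the \'" in the pattern reaches the regex engine as the two characters '". A minimal sketch, using a tiny made-up character class and a nowdoc as one possible way to avoid quote escaping altogether:

```php
<?php
// Single-quoted string: \' becomes ', while " is kept exactly as typed.
$class = '[\'"]';
var_dump($class === "['\"]"); // same four characters as the double-quoted form

// Hypothetical alternative: a nowdoc needs no quote escaping at all.
$pattern = <<<'REGEX'
~['"]~
REGEX;

// Count the quote characters in a made-up sample string.
$count = preg_match_all($pattern, 'say "hi" to O\'Brien');
var_dump($count);
```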

Because $pattern matches each URL in its entirety, every match is replaced wholesale and the output is just:

    "Test string contains google.com
    google.com
    google.com
    google.com
    google.com
    more urls google.com some other text"

All in all, I recommend either @Ampere's answer (though its regex is looser than your original) or using capturing parentheses and a backreference, along the lines of preg_replace($pattern,'google.com/\2',$textStr). You would have to rearrange the capturing parentheses first, because that call will not work with the groups as they currently stand.
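To illustrate the backreference idea, here is a minimal sketch that swaps only the host and keeps the rest of each URL. The pattern is a deliberately simple stand-in invented for this example, not the one from the question, and the sample text is also an assumption.

```php
<?php
// Hypothetical, simplified pattern: the scheme and host are matched but not
// captured; group 1 captures the optional path, query and fragment.
$pattern = '~\bhttps?://[^/\s]+(/\S*)?~i';

$text = "see http://foo.com/blah_(wikipedia)#cite-1 and http://foo.com now";

// $1 in the replacement re-inserts whatever group 1 captured
// (an empty string when the URL has no path).
$result = preg_replace($pattern, 'google.com$1', $text);

echo $result, PHP_EOL;
```

preg_replace accepts both the $1 and the \1 spelling for references in the replacement string; the replacement is single-quoted here so that PHP does not try to interpolate $1 as a variable.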

This site is useful for testing things out.

万劫不复 2025-01-01 09:40:44
    // Match an optional "www." plus a run of alphanumerics ending in a dot (here: "foo.").
    $pattern = '/([wW]{3}\.|)[A-Za-z0-9]+?\./';
    $text = "Test string contains http://foo.com/more_(than)_one_(parens)
    http://foo.com/blah_(wikipedia)#cite-1
    http://foo.com/blah_(wikipedia)_blah#cite-1
    http://foo.com/unicode_(?)_in_parens
    http://foo.com/(something)?after=parens
    more urls foo.ca/me some other text";
    // Replace only that part, leaving the rest of each URL intact.
    $output = preg_replace($pattern, "abc.", $text);
    print_r($output);

The output will be:

    Test string contains http://abc.com/more_(than)_one_(parens)
    http://abc.com/blah_(wikipedia)#cite-1
    http://abc.com/blah_(wikipedia)_blah#cite-1
    http://abc.com/unicode_(?)_in_parens
    http://abc.com/(something)?after=parens
    more urls abc.ca/me some other text