Converting text to links - PHP regex problem

Posted on 2025-01-07 23:22:15

I'm having a bit of a problem converting plain text to a URL.
What I'd like is this: if I have text such as www.google.com, it gets converted to

<a href="www.google.com" target="_blank">www.google.com</a>

I'm kind of a RegEx noob, but I tried this:

$description = preg_replace('@(www.([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="$1" target="_blank">$1</a>', $description);

The $description variable is a piece of text that may contain unconverted URLs.

With the code above, I get this link:

<a target="_blank">www.google.com</a>

So the href part is left out. This must be a piece of cake for you regex wizards out there, so thanks in advance for any help.

If there is another (better?) way to convert plain text to URLs, please say so and I'll try it.

Comments (4)

迷爱 2025-01-14 23:22:15

If your only problem is that the link incorrectly points towards www.google.com instead of the fully qualified URL, such as http://www.google.com, then the correct replacement would be:

$description = preg_replace('@(www.([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', '<a href="http://$1" target="_blank">$1</a>', $description);
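
As a quick check of the replacement above, here is a minimal sketch that wraps it in a function and runs it on a sample sentence. The function name linkify_www and the sample text are assumptions made for illustration; the pattern and replacement string are the ones from this answer.

```php
<?php
// Hypothetical wrapper (name is an assumption) around the replacement above.
function linkify_www(string $text): string
{
    // Same pattern as in the answer; only the replacement adds "http://".
    $pattern = '@(www.([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@';
    $replacement = '<a href="http://$1" target="_blank">$1</a>';

    return preg_replace($pattern, $replacement, $text);
}

// Sample input, made up for this sketch.
echo linkify_www('Visit www.google.com today'), "\n";
// prints: Visit <a href="http://www.google.com" target="_blank">www.google.com</a> today
```

One limitation to be aware of: because the pattern only looks for "www", text that already contains http://www.google.com would end up with the link wrapped after the existing "http://" prefix.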
ぺ禁宫浮华殁 2025-01-14 23:22:15

<a href="www.example.com">www.example.com</a> will not work correctly in modern browsers, because the href value is treated as a relative URL and resolved against the current page, e.g. http://example.com/www.example.com. You need to specify the protocol (http, https, etc.).

The following replaces all text "links" starting with www, ftp, http, https or file with HTML a tags:

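<?php

    // Match runs of non-whitespace that start with www, ftp, http, https or file
    $pattern = '/(www|ftp|http|https|file)(:\/\/)?[\S]+(\b|$)/i';
    $string = 'hello http://example.com https://graph.facebook.com    http://www.example.com www.google.com';

    function create_a_tags( $matches ){

        $url = $matches[0];
        // Bare "www" links carry no scheme, so default to http
        // (strtolower because the pattern is case-insensitive)
        if ( 'www' == strtolower( $matches[1] ) ){
            $url = 'http://' . $matches[0];
        }
        // Escape both the href value and the link text for HTML output
        $escaped = htmlspecialchars($matches[0]);
        return sprintf( '<a href="%s">%s</a>', htmlspecialchars($url), $escaped );
    }

    echo preg_replace_callback( $pattern, 'create_a_tags', $string );

?>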

prints

hello <a href="http://example.com">http://example.com</a>
<a href="https://graph.facebook.com">https://graph.facebook.com</a>
<a href="http://www.example.com">http://www.example.com</a>
<a href="http://www.google.com">www.google.com</a>
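
The two points made in this answer (an href needs a scheme, and the output is HTML) can be isolated in a small helper. The following is only a sketch: the function name build_anchor and the sample URLs are assumptions, and the scheme check is a simple "starts with scheme://" test rather than full URL parsing.

```php
<?php
// Hypothetical helper (name is an assumption): build an <a> tag for one matched URL.
function build_anchor(string $match): string
{
    // Treat anything that already starts with "scheme://" as absolute.
    $has_scheme = preg_match('~^[a-z][a-z0-9+.-]*://~i', $match) === 1;

    // Without a scheme the browser would resolve the href relative to the page.
    $url = $has_scheme ? $match : 'http://' . $match;

    // Escape both the attribute value and the visible text.
    return sprintf(
        '<a href="%s">%s</a>',
        htmlspecialchars($url, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($match, ENT_QUOTES, 'UTF-8')
    );
}

echo build_anchor('www.google.com'), "\n";
// prints: <a href="http://www.google.com">www.google.com</a>
echo build_anchor('http://example.com/?a=1&b=2'), "\n";
// prints: <a href="http://example.com/?a=1&amp;b=2">http://example.com/?a=1&amp;b=2</a>
```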
小姐丶请自重 2025-01-14 23:22:15

Quite a while ago we compared different approaches to URL verification and identification. See the table of regular expressions.

I suggest you drop your regex and use the "gruber revised" pattern instead. A solution (PHP 5.3 or later, since it uses an anonymous function) could look like:

<?php

$string = 'hello 
http://example.com 
https://graph.facebook.com 
http://www.example.com
www.google.com
ftp://example.com';

$string = preg_replace_callback('#(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))#iS', function($m) {
    // use http as default protocol, if none given
    if (strpos($m[0], '://') === false) {
        $m[0] = 'http://' . $m[0];
    }
    // text -> html is a context switch, take care of special characters
    $_m = htmlspecialchars($m[0]);
    return '<a href="' . $_m . '" target="_blank">' . $_m . '</a>';
}, $string);

echo $string, "\n";
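
The comment about the text-to-HTML context switch applies to the surrounding text as well as to the URLs. One possible way to handle that, sketched below, is to split the input on URLs, escape every piece, and wrap only the URL pieces. The function name linkify_escaped is hypothetical, and the short pattern is a simplified stand-in chosen for readability; it is an assumption and not the Gruber pattern.

```php
<?php
// Hypothetical function (name is an assumption): escape all text, link the URLs.
function linkify_escaped(string $text): string
{
    // Simplified stand-in pattern, NOT the Gruber pattern from the answer.
    $pattern = '~((?:https?://|www\.)[^\s<>"]+)~i';

    // With one capture group, even indexes are plain text and odd indexes are URLs.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = '';
    foreach ($parts as $i => $part) {
        $safe = htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
        if ($i % 2 === 0) {
            $html .= $safe; // plain text: escape only
            continue;
        }
        // Use http as the default protocol if none is given.
        $href = (strpos($part, '://') === false) ? 'http://' . $part : $part;
        $html .= '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8')
               . '" target="_blank">' . $safe . '</a>';
    }

    return $html;
}

echo linkify_escaped('a < b, see www.google.com'), "\n";
// prints: a &lt; b, see <a href="http://www.google.com" target="_blank">www.google.com</a>
```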
风渺 2025-01-14 23:22:15

I've found the solution. It indeed had nothing to do with the regex; that part was correct. My coworker had added this line of jQuery code in the head:

$("a").removeAttr('href');

So obviously the href attribute was being removed on the client side. I didn't look at this because I was sure it was a PHP/regex problem. Removing that line fixed the problem.

I realize this was a stupid error and that it was impossible for you to solve, so thanks all for helping, +1 to you guys.
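
For anyone hitting something similar, one way to rule out the PHP side is to inspect the generated markup before it reaches the browser. The sketch below does that with DOMDocument; the helper name anchor_hrefs and the sample markup are assumptions made for illustration.

```php
<?php
// Hypothetical helper (name is an assumption): list the href of every <a>
// in an HTML fragment, using null where the attribute is missing.
function anchor_hrefs(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings quiet
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $hrefs[] = $a->hasAttribute('href') ? $a->getAttribute('href') : null;
    }

    return $hrefs;
}

// Sample markup as preg_replace would produce it on the server.
$generated = '<a href="http://www.google.com" target="_blank">www.google.com</a>';
var_dump(anchor_hrefs($generated));
```

If the href shows up here but is missing in the browser's element inspector, the attribute is being removed after the page loads, which points to client-side script rather than the regex.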
