preg_match_all - regex help

Posted on 2024-10-17 04:17:26

I've been given the following regular expression:

 if (preg_match_all("'(http://)?(www[.])?(youtube|vimeo)[^\s]+'is", $prova, $n))
 {
     // $n[3] holds the site name captured by (youtube|vimeo); $n[0] holds the full match
     foreach ($n[3] as $key => $site)
     {
         $video_links[$site][] = $n[0][$key];
     }
 }

However, if I have a string like:

"hello, look at my vimeo video here:
http://www.vimeo.com..../ very nice hm?"

Instead of receiving only the URL, I'm ALSO getting the word vimeo.

I believe the regular expression is retrieving more than it should. I would like to retrieve ONLY the URLs that it finds, not every reference to "vimeo" or "youtube".

Can I request your help in narrowing the scope of this expression, so that only the URLs are retrieved?


Comments (2)

阳光①夏 2024-10-24 04:17:26

The first question mark ? in the regex is unneeded. It makes the preceding http:// group optional, so the pattern can also match the bare word vimeo in the text. Try:

preg_match_all("'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is",
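To make the difference concrete, here is a minimal sketch that runs the question's pattern and the tightened one on the same input. The sample text and the variable names ($text, $original, $fixed, $before, $after) are assumptions made up for illustration; the bare word is written as "vimeo-video" because the original pattern needs at least one non-space character after the site name before it can match.

```php
<?php
// Hypothetical sample text: the bare word "vimeo" is directly followed by
// a non-space character, so the original pattern can latch onto it.
$text = 'Look at my vimeo-video here: http://www.vimeo.com/12345 very nice, hm?';

// Pattern from the question: "http://" and "www." are both optional.
$original = "'(http://)?(www[.])?(youtube|vimeo)[^\s]+'is";
// Pattern from this answer: "http://" is required and a literal dot
// must follow the site name.
$fixed = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is";

preg_match_all($original, $text, $before);
preg_match_all($fixed, $text, $after);

print_r($before[0]); // full matches of the original pattern
print_r($after[0]);  // full matches of the tightened pattern
```

With this input the original pattern should report both "vimeo-video" and the URL, while the tightened pattern should report only the URL.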

Tip: add (?<![,.)]) at the end of the pattern if you want to exclude the trailing punctuation that often messes up such URL searches.
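A small sketch of what the lookbehind changes, assuming a made-up sentence in which the URLs are followed by a comma and by ")." (the text, the video id "abc", and the variable names are hypothetical). The greedy [^\s]+ first swallows the punctuation, then backtracks until the last matched character is no longer one of , . or ).

```php
<?php
// Hypothetical text with punctuation glued to the end of each URL.
$text = 'See http://www.vimeo.com/12345, or (http://www.youtube.com/watch?v=abc).';

// Pattern from this answer, without and with the negative lookbehind.
$plain   = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is";
$trimmed = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+(?<![,.)])'is";

preg_match_all($plain, $text, $withPunctuation);
preg_match_all($trimmed, $text, $withoutPunctuation);

print_r($withPunctuation[0]);    // URLs still carry the trailing "," and ")."
print_r($withoutPunctuation[0]); // trailing punctuation is dropped
```

One trade-off to keep in mind: a URL that legitimately ends in one of those characters would lose it as well.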


As an alternative, with http:// and www. optional, but requiring the presence of a path:

preg_match_all("'(http://|www[.])*(youtube|vimeo)[.]\w+/[^\s]+'is",
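The following sketch shows how this alternative behaves on a few cases. The sample text and variable names are assumptions for illustration only. Because the pattern insists on a slash and at least one further character after the top-level domain, a bare word or a bare domain without a path should be skipped.

```php
<?php
// Hypothetical text: a bare word, two links with a path, and one bare domain.
$text = 'vimeo is fun: vimeo.com/12345 and www.youtube.com/watch?v=abc but not http://www.vimeo.com';

// Alternative pattern from this answer: prefixes are optional, a path is required.
$pattern = "'(http://|www[.])*(youtube|vimeo)[.]\w+/[^\s]+'is";

preg_match_all($pattern, $text, $m);

print_r($m[0]); // only the two links that include a path
```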

深海不蓝 2024-10-24 04:17:26

Maybe the following code can help out a bit:

<?php
    // Test string
    $prova = "\"hello, look at my <strong>vimeo</strong> video here:  <a href=\"http://www.vimeo.com..../\" rel=\"nofollow\">http://www.vimeo.com..../</a> very nice hm?\"";
    $prova .= " vimeo vimeo.com/something?id=somethingcrazy&testing=true  ";
    // Start with an empty array so print_r() has something defined even when nothing matches
    $video_links = array();
    // If we match, capture all matches
    if (preg_match_all("'(http://)?(www\.)?(youtube|vimeo)\.([a-z0-9_/?&+=.]+)'is", $prova, $n)) {
        foreach ($n[0] as $key => $site) {
            // $site is the text matched by the whole pattern;
            // use it as the key and store the match under it
            $video_links[$site][] = $n[0][$key];
        }
    }
    // Display results
    print_r($video_links);
?>

This will not match the bare word vimeo. It will match vimeo.com/something?id=somethingcrazy&testing=true, and it will match http://www.vimeo.com..../ twice (once in the href attribute and once in the link text).
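If the double match is unwanted, one possible variation (not part of the answer above, just a sketch) is to group by site name as the question's code did and drop repeated URLs with array_unique(). The strtolower() call and the second loop are additions assumed here for illustration; the test string and the pattern are the ones from the answer.

```php
<?php
// Test string taken from the answer above.
$prova = "\"hello, look at my <strong>vimeo</strong> video here:  <a href=\"http://www.vimeo.com..../\" rel=\"nofollow\">http://www.vimeo.com..../</a> very nice hm?\"";
$prova .= " vimeo vimeo.com/something?id=somethingcrazy&testing=true  ";

$video_links = [];
if (preg_match_all("'(http://)?(www\.)?(youtube|vimeo)\.([a-z0-9_/?&+=.]+)'is", $prova, $n)) {
    // $n[3] is the site name, $n[0] the full match (same indexes in both arrays).
    foreach ($n[3] as $key => $site) {
        // Lowercase the key because the pattern is case-insensitive.
        $video_links[strtolower($site)][] = $n[0][$key];
    }
}

// Remove repeated URLs per site and reindex the lists from 0.
foreach ($video_links as $site => $urls) {
    $video_links[$site] = array_values(array_unique($urls));
}

print_r($video_links);
```

This should leave a single "vimeo" entry holding each distinct link once.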
