preg_match_all - regular expression help

Posted on 2024-10-17 04:17:26

I've been given the following regular expression:

 if (preg_match_all("'(http://)?(www[.])?(youtube|vimeo)[^\s]+'is",$prova,$n))
 {
     foreach ($n[3] as $key => $site)
     {
         $video_links[$site][] = $n[0][$key];
     }
 }

However, if I have a string like:

"hello, look at my vimeo video here:
http://www.vimeo.com..../ very nice hm?"

Instead of receiving only the URL, I'm also getting the word vimeo.

I believe the regular expression is retrieving more than it should, and I would like to retrieve ONLY the URLs that it finds, not every reference to "vimeo" or "youtube".

Can I request your help in narrowing the scope of this expression, so that only the URLs are retrieved?


阳光①夏 2024-10-24 04:17:26

The first question mark ? in the regex is unneeded. It makes the preceding http:// group optional, so the pattern can also match a bare vimeo in the text whenever non-whitespace characters directly follow it. Try:

preg_match_all("'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is", $prova, $n)
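To see the difference between the two patterns, here is a minimal sketch. The sample text and the variable names `$text`, `$original`, `$fixed`, `$before` and `$after` are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample text: "vimeo," is a bare mention, the second item is a URL.
$text = 'I like vimeo, see http://www.vimeo.com/12345 for details';

// Pattern from the question: the http:// and www. prefixes are both optional.
$original = "'(http://)?(www[.])?(youtube|vimeo)[^\s]+'is";
// Suggested pattern: http:// is required and a literal dot must follow the site name.
$fixed = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is";

preg_match_all($original, $text, $before);
preg_match_all($fixed, $text, $after);

print_r($before[0]); // full matches of the question's pattern
print_r($after[0]);  // full matches of the suggested pattern
```

With this input, the question's pattern should return both `vimeo,` and the URL, while the suggested pattern should return only the URL.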

Tip: add (?<![,.)]) at the end of the pattern if you want to exclude the trailing punctuation that often trips up URL searches like this.
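A quick sketch of what the negative lookbehind changes, assuming a made-up sentence in which a comma directly follows the URL (`$text`, `$plain`, `$trimmed`, `$a` and `$b` are illustrative names only):

```php
<?php
// Hypothetical sample text: the URL is immediately followed by a comma.
$text = 'Watch http://www.vimeo.com/12345, then tell me what you think.';

// Without the lookbehind, [^\s]+ swallows the trailing comma.
$plain = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+'is";
// With the lookbehind, a match may not end in a comma, a dot, or a closing parenthesis.
$trimmed = "'(http://)(www[.])?(youtube|vimeo)[.][^\s]+(?<![,.)])'is";

preg_match_all($plain, $text, $a);
preg_match_all($trimmed, $text, $b);

echo $a[0][0], "\n"; // http://www.vimeo.com/12345,
echo $b[0][0], "\n"; // http://www.vimeo.com/12345
```

The lookbehind only forces the regex engine to backtrack past a final `,`, `.` or `)`. Punctuation in the middle of the URL is left alone.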

As an alternative, this version makes http:// and www. optional but requires a path after the domain:

preg_match_all("'(http://|www[.])*(youtube|vimeo)[.]\w+/[^\s]+'is", $prova, $n)
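A small sketch of how this alternative behaves. The sample text is invented for illustration: it contains a bare mention, a domain without a path, and two URLs with paths.

```php
<?php
// Hypothetical sample text covering the cases of interest.
$text = 'vimeo rocks: vimeo.com and vimeo.com/42 or http://www.youtube.com/watch?v=abc';

// http:// and www. may each appear or not, but "<site>.<tld>/<path>" is required.
$pattern = "'(http://|www[.])*(youtube|vimeo)[.]\w+/[^\s]+'is";

preg_match_all($pattern, $text, $n);
print_r($n[0]); // only the two entries that have a path after the domain
```

Here the bare word and the path-less `vimeo.com` should be skipped, leaving `vimeo.com/42` and `http://www.youtube.com/watch?v=abc`. The trade-off is that a link to a site's front page, with no path, is not picked up.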


深海不蓝 2024-10-24 04:17:26

Maybe the following code can help out a bit:

<?php
    //Test string
    $prova = "\"hello, look at my <strong>vimeo</strong> video here:  <a href=\"http://www.vimeo.com..../\" rel=\"nofollow\">http://www.vimeo.com..../</a> very nice hm?\"";
    $prova .= " vimeo vimeo.com/something?id=somethingcrazy&testing=true  ";
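    //start with an empty result so print_r() has something to show even without matches
    $video_links = array();
    //if we match then capture all matches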
    if (preg_match_all("'(http://)?(www\.)?(youtube|vimeo)\.([a-z0-9_/?&+=.]+)'is",$prova,$n)){
        foreach ($n[0] as $key => $site){
            //for each match that matched the whole pattern
            //save the match as a site
            $video_links[$site][] = $n[0][$key];
        }
    }
    //display results
    print_r($video_links);
?>

This will not match the bare word vimeo. It will match vimeo.com/something?id=somethingcrazy&testing=true, and it will match http://www.vimeo.com..../ twice: once in the href attribute and once in the link text.
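If the double match is unwanted, one possible approach (a sketch, not part of the answer above) is to group by site name as the question's loop does and then drop repeated URLs with `array_unique()`. The input string below is hypothetical.

```php
<?php
// Hypothetical input: the same URL appears in the href attribute and in the link text.
$prova = '<a href="http://www.vimeo.com/12345">http://www.vimeo.com/12345</a> and vimeo.com/67890';

$video_links = array();
if (preg_match_all("'(http://)?(www\.)?(youtube|vimeo)\.([a-z0-9_/?&+=.]+)'is", $prova, $n)) {
    foreach ($n[3] as $key => $site) {
        // Group by lower-cased site name (capture group 3), as in the question.
        $video_links[strtolower($site)][] = $n[0][$key];
    }
    foreach ($video_links as $site => $urls) {
        // array_unique() drops repeats; array_values() reindexes from 0.
        $video_links[$site] = array_values(array_unique($urls));
    }
}
print_r($video_links);
```

For this input the result should hold a single `vimeo` key with two entries, `http://www.vimeo.com/12345` and `vimeo.com/67890`.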
