YouTube regex swallows the rest of the text
I'm running preg_match_all and str_replace on a block of text to grab YouTube URLs and replace them with the correct embed code.
Let's say I have the following block of text:
"bla bla bla bla <-youtube-url-> last few words"
Everything works fine: the YouTube URL is replaced with the embed code, etc. However, the "last few words" disappear from the final output after str_replace is run. I suspect that the regex is swallowing everything after the URL. This is what I'm using to match and extract YouTube IDs:
%(?:youtube\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i
Any help would be greatly appreciated!
Update:
I just discovered that the problem only happens if the YouTube URL has trailing parameters. With the following input, the last few words are swallowed:
'www.youtube.com/watch?v=XXXXXXXXX&parameter=data last few words'
But if the input is like this:
'www.youtube.com/watch?v=XXXXXXXXX last few words'
it works fine. Can anyone help with the needed adjustments for the regular expression?
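One way to check whether the pattern itself swallows trailing text is to run it against both inputs. The sketch below is a JavaScript port, not the asker's PHP: the pattern text is copied from the question, while the helper name replaceYouTubeUrls, the [embed:...] placeholder, and the 11-character ID abcdefghijk are assumptions made for illustration (the XXXXXXXXX placeholder in the question is shorter than the 11 characters the pattern requires).

```javascript
// JavaScript port of the PCRE pattern from the question
// (same pattern text, with the "%" delimiters dropped).
const YOUTUBE_ID = new RegExp(
  String.raw`(?:youtube\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})`,
  "gi"
);

// Hypothetical helper: swap each matched URL fragment for a placeholder
// that stands in for the real embed code.
function replaceYouTubeUrls(text) {
  return text.replace(YOUTUBE_ID, (whole, id) => `[embed:${id}]`);
}

// Assumed sample inputs, modelled on the two cases in the question.
const withParams =
  "bla bla www.youtube.com/watch?v=abcdefghijk&parameter=data last few words";
const withoutParams =
  "bla bla www.youtube.com/watch?v=abcdefghijk last few words";

console.log(replaceYouTubeUrls(withParams));
// bla bla www.[embed:abcdefghijk]&parameter=data last few words
console.log(replaceYouTubeUrls(withoutParams));
// bla bla www.[embed:abcdefghijk] last few words
```

In both cases the match ends right after the ID, so the words after the URL survive the replacement. The leading www. and the &parameter=data suffix are simply left in place, which suggests the missing words are lost somewhere other than in the regex.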
My bad. There was no problem with the regex after all, contrary to what I first suspected.
I was passing the user input to the PHP handler without first escaping it via encodeURIComponent(). Thus, the handler assumed &parameter=data was the next input parameter, resulting in a broken POST variable. Sorry for my incompetence, and thanks for all the help!
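The effect described above can be reproduced without a server. This is a minimal sketch under assumptions: the field name text and the sample input are made up, and URLSearchParams only imitates how a handler would split a form-encoded body. It is not the asker's actual client or PHP code.

```javascript
// Assumed sample input, modelled on the one in the question.
const userInput =
  "www.youtube.com/watch?v=abcdefghijk&parameter=data last few words";

// Unescaped: the "&" inside the input ends the hypothetical "text" field early,
// and everything after it is parsed as a separate field named "parameter".
const broken = new URLSearchParams("text=" + userInput);
console.log(broken.get("text"));      // www.youtube.com/watch?v=abcdefghijk
console.log(broken.get("parameter")); // data last few words

// Escaped: encodeURIComponent() turns "&", "=", "?", "/" and spaces into
// percent escapes, so the whole input stays inside the "text" field.
const safe = new URLSearchParams("text=" + encodeURIComponent(userInput));
console.log(safe.get("text") === userInput); // true
console.log(safe.has("parameter"));          // false
```

This matches the symptom in the question: the text reaching the regex already ends at the video ID, so the trailing words are gone before any matching happens.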
I usually break up complicated alternations to find out what's going on.
It appears you might have trouble with the last term, [^"&?/ ]{11}, but I'm not sure what you are trying to do. (below is in Perl)
Output:
Change the .+ to \S+ so that you don't capture whitespace as part of the regex. The .* was capturing the entire line, and the rest of your regex wasn't doing anything.
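A small sketch of what the greedy .* does in practice, again as a JavaScript port. Assumptions: the tightened pattern replaces both .+ and .* with their \S counterparts, which is one reading of the suggestion above, and the helper extractIds and the sample IDs are made up. With a single URL per line the greedy .* backtracks to the v= it needs, so the difference mainly shows when one line holds two URLs.

```javascript
// Pattern text from the question (PCRE "%" delimiters dropped).
const ORIGINAL = String.raw`(?:youtube\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})`;
// Assumed variant: ".+" and ".*" are no longer allowed to cross whitespace.
const TIGHTENED = String.raw`(?:youtube\.com/(?:[^/]+/\S+/|(?:v|e(?:mbed)?)/|\S*[?&]v=)|youtu\.be/)([^"&?/ ]{11})`;

// Hypothetical helper: collect every captured video ID in the text.
function extractIds(pattern, text) {
  const re = new RegExp(pattern, "gi");
  return Array.from(text.matchAll(re), (m) => m[1]);
}

// Two made-up 11-character IDs on one line.
const idA = "a".repeat(11);
const idB = "b".repeat(11);
const line =
  `first www.youtube.com/watch?v=${idA} then www.youtube.com/watch?v=${idB} end`;

// Greedy ".*" runs from the first URL to the last "v=", so only the second ID
// is captured and the text between the two URLs becomes part of the match.
console.log(extractIds(ORIGINAL, line));  // [ 'bbbbbbbbbbb' ]

// "\S*" stops at the first space, so each URL is matched on its own.
console.log(extractIds(TIGHTENED, line)); // [ 'aaaaaaaaaaa', 'bbbbbbbbbbb' ]
```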
I'm not clear on what exactly you are trying to do, but I suggest that you try a regex tester tool; there are several available. It lets you visually examine the results of a regex.