YouTube URL - regular expression
I have the following config in my AntiSamy policy file:
Old YouTube Object:
<object width="1280" height="720">
<param
name="movie"
value="http://www.youtube.com/v/Hl-zzrqQoSE
?version=3
&hl=en_US
&rel=0">
</param>
<param name="allowFullScreen" value="true">
</param>
<param name="allowscriptaccess" value="always">
</param>
<embed src="http://www.youtube.com/v/Hl-zzrqQoSE
?version=3
&hl=en_US
&rel=0"
type="application/x-shockwave-flash"
width="1280"
height="720"
allowscriptaccess="always"
allowfullscreen="true">
</embed>
</object>
The AntiSamy config:
<common-regexps>
<regexp name="YouTubeURL" value="(\s)*(http(s?)://)www.youtube.com/v/[\p{L}\p{N}]+[\p{L}\p{N}\p{Zs}\.\#@\$%\+&;:\-_~,\?=/!]*(\s)*"/>
....
<!-- Tags related to YouTube -->
<tag name="object" action="validate">
<attribute name="height"/>
<attribute name="width"/>
<attribute name="type">
<literal-list>
<literal value="application/x-shockwave-flash"/>
</literal-list>
</attribute>
<attribute name="data">
<regexp-list>
<regexp name="YouTubeURL"/>
</regexp-list>
</attribute>
</tag>
<tag name="embed" action="validate">
<attribute name="height"/>
<attribute name="width"/>
<attribute name="type">
<literal-list>
<literal value="application/x-shockwave-flash"/>
</literal-list>
</attribute>
<attribute name="allowfullscreen">
<regexp-list>
<regexp name="boolean"/>
</regexp-list>
</attribute>
<attribute name="allowscriptaccess">
<literal-list>
<literal value="always"/>
</literal-list>
</attribute>
<attribute name="src">
<regexp-list>
<regexp name="YouTubeURL"/>
</regexp-list>
</attribute>
<attribute name="movie">
<regexp-list>
<regexp name="YouTubeURL"/>
</regexp-list>
</attribute>
</tag>
My current config for iframe:
<!-- Frame & related tags -->
<tag name="iframe" action="remove"/>
<tag name="frameset" action="remove"/>
<tag name="frame" action="remove"/>
The new YouTube iframe:
<!-- src="https://www.youtube-nocookie.com/embed/Hl-zzrqQoSE" -->
<iframe
width="1280"
height="720"
src="https://www.youtube.com/embed/Hl-zzrqQoSE"
frameborder="0"
allowfullscreen>
</iframe>
I figure the config for iframe should look like this:
<tag name="iframe" action="validate">
<attribute name="height"/>
<attribute name="width"/>
<attribute name="frameborder"/>
<attribute name="src">
<regexp-list>
<regexp name="YouTubeURL"/>
</regexp-list>
</attribute>
<attribute name="allowfullscreen">
<regexp-list>
<regexp name="boolean"/>
</regexp-list>
</attribute>
</tag>
How do I change the regex so that it accepts both the old and the new links, such as:
https://www.youtube-nocookie.com/embed/Hl-zzrqQoSE
https://www.youtube.com/embed/Hl-zzrqQoSE
https://www.youtube.com/v/Hl-zzrqQoSE
http://www.youtube-nocookie.com/v/Hl-zzrqQoSE?version=3&hl=en_US&rel=0
http://www.youtube.com/v/Hl-zzrqQoSE?version=3&hl=en_US&rel=0
Comments (1)
I took the liberty of removing unnecessary capture groups, escapes, and characters.
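As a rough illustration of what such a simplification could look like, here is a minimal Java sketch. The pattern is an assumption written for this example, not the expression from the answer, and the class and method names (`YouTubeUrlCheck`, `isAllowed`) are hypothetical. It assumes the policy engine requires the whole attribute value to match, which is what `Matcher.matches()` does.

```java
import java.util.regex.Pattern;

public class YouTubeUrlCheck {
    // Hypothetical pattern: optional surrounding whitespace, http or https,
    // either YouTube host, an /embed/ or /v/ path, a video ID made of word
    // characters and hyphens, and an optional query string.
    static final Pattern YOUTUBE_URL = Pattern.compile(
        "\\s*https?://www\\.youtube(?:-nocookie)?\\.com/(?:embed|v)/[\\w-]+"
        + "(?:\\?[\\w=&;.%-]*)?\\s*");

    static boolean isAllowed(String url) {
        // matches() requires the entire string to match, not just a prefix.
        return YOUTUBE_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://www.youtube-nocookie.com/embed/Hl-zzrqQoSE",
            "https://www.youtube.com/embed/Hl-zzrqQoSE",
            "https://www.youtube.com/v/Hl-zzrqQoSE",
            "http://www.youtube-nocookie.com/v/Hl-zzrqQoSE?version=3&hl=en_US&rel=0",
            "http://www.youtube.com/v/Hl-zzrqQoSE?version=3&hl=en_US&rel=0",
            // A look-alike host that should be rejected.
            "https://www.youtube.com.evil.example/embed/Hl-zzrqQoSE"
        };
        for (String url : urls) {
            System.out.println(isAllowed(url) + "  " + url);
        }
    }
}
```

If a pattern like this is copied into the policy file's `value` attribute, the doubled backslashes become single ones, and a literal `&` has to be written as `&amp;` because it sits inside XML.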
Although I personally would use something like:
That puts the entire YouTube URL in match group 0 and the video ID in match group 1.
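To make the group numbering concrete, here is a small Java sketch. The pattern is again an assumption for illustration, not the answer's own expression, and `YouTubeVideoId` and `videoId` are hypothetical names. The only capturing group wraps the ID; the host and path alternatives use non-capturing groups so they do not shift the numbering.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YouTubeVideoId {
    // Hypothetical pattern with a single capturing group around the video ID.
    static final Pattern YOUTUBE_URL = Pattern.compile(
        "https?://www\\.youtube(?:-nocookie)?\\.com/(?:embed|v)/([\\w-]+)(?:\\?\\S*)?");

    static Optional<String> videoId(String url) {
        Matcher m = YOUTUBE_URL.matcher(url.trim());
        // group(1) is the first (and only) capturing group: the video ID.
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "http://www.youtube.com/v/Hl-zzrqQoSE?version=3&hl=en_US&rel=0";
        Matcher m = YOUTUBE_URL.matcher(url);
        if (m.matches()) {
            System.out.println("group 0: " + m.group(0)); // the whole matched URL
            System.out.println("group 1: " + m.group(1)); // the video ID
        }
    }
}
```

For validation inside a policy file the capture group is not needed; it only matters when the same expression is reused in code to pull out the ID.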
Also, it doesn't make much sense to use Unicode properties when YouTube's URLs don't contain non-ASCII characters.
Demo: http://rubular.com/r/jv4zO9ys2L