preg_match to match src=, background= and url(..)
I would like to find a regular expression that could find (in given HTML) the following images:
- Those captured in: src=""
- Those captured in: src=''
- Those captured in: background=""
- Those captured in: background=''
- Those captured in: url("")
- Those captured in: url('')
- Those captured in: url()
So far I have come up with:
preg_match_all("/src=((\"|'|)?(.*\.(png|gif|jpg))(\"|'|))/Ui", $strHTML, $arrMatches);
preg_match_all("/background=((\"|'|)?(.*\.(png|gif|jpg))(\"|'|))/Ui", $strHTML, $arrMatches);
preg_match_all("/url\((\"|'|)?((.*\.(png|gif|jpg))(\"|'|))\)/Ui", $strHTML, $arrMatches);
But those are incomplete in that the matches don't include the prefix (src/background/url). Also, security-wise, I think they can be improved further, to prevent somebody from entering src="http://somesite.com/someurl.exe?ext=jpg"
Any help in the right direction is appreciated.
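The security concern above (an .exe URL that only mentions jpg in its query string) can be checked separately from the regex by looking at the extension of the URL path alone. The following is a minimal sketch, not a definitive fix: hasImageExtension() is a hypothetical helper name, and the allowed-extension list is an assumption taken from the patterns in the question. It relies only on PHP's built-in parse_url() and pathinfo().

```php
<?php
// Hypothetical helper (not from the original post): returns true only when
// the *path* part of the URL ends in an allowed image extension.
function hasImageExtension(string $url, array $allowed = ['png', 'gif', 'jpg', 'jpeg']): bool
{
    // parse_url() separates the path from the query string and fragment,
    // so "?ext=jpg" is never considered part of the file name.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // malformed URL, or no path component at all
    }

    // Compare the extension case-insensitively against the allowed list.
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return in_array($ext, $allowed, true);
}

var_dump(hasImageExtension('http://somesite.com/someurl.exe?ext=jpg')); // bool(false)
var_dump(hasImageExtension('images/test1.gif'));                        // bool(true)
```

Note that this only inspects the name; it says nothing about what the server actually returns for that URL.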
Edit:
I think I got it, although the code can surely be improved, possibly even combined and/or optimized :)
/* match CSS url() links */
preg_match_all("/(url\((\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|)\))/Ui", $strHTML, $arrMatches);
Array
(
    [0] => Array
        (
            [0] => url('test1.gif')
            [1] => url(test2.gif)
            [2] => url("test3.gif")
        )
    [1] => Array
        (
            [0] => url('test1.gif')
            [1] => url(test2.gif)
            [2] => url("test3.gif")
        )
    [2] => Array
        (
            [0] => '
            [1] =>
            [2] => "
        )
    [3] => Array
        (
            [0] => test1.gif
            [1] => test2.gif
            [2] => test3.gif
        )
    [4] => Array
        (
            [0] => gif
            [1] => gif
            [2] => gif
        )
    [5] => Array
        (
            [0] => '
            [1] =>
            [2] => "
        )
)
/* match img links (double-quoted, single-quoted or unquoted) */
preg_match_all("/(src=(\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|))/Ui", $strHTML, $arrMatches);
/* match background links (double-quoted, single-quoted or unquoted) */
preg_match_all("/(background=(\"|'|)(.*\.(png|gif|jpg|jpeg))(\"|'|))/Ui", $strHTML, $arrMatches);
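Since the edit mentions that the three patterns could possibly be combined, here is one way that might look. This is a sketch under assumptions, not the accepted solution: findImagePaths() is a hypothetical name, the pattern uses a backreference (\2) so the closing quote must equal the opening one, and it stops the path at ? or # so a query string such as ?ext=jpg cannot supply the extension.

```php
<?php
// Hypothetical helper (not from the original post): collects image paths
// found after src=, background= or url( in a chunk of HTML.
function findImagePaths(string $html): array
{
    // Group 1: prefix, group 2: optional opening quote, group 3: the path.
    // \2 requires the same quote (or nothing) to close the value, and the
    // lookahead requires the value to end right after the extension.
    $pattern = '~\b(src\s*=|background\s*=|url\()\s*(["\']?)'
             . '([^"\'\s()<>?#]+?\.(?:png|gif|jpe?g))\2(?=[\s)>/]|$)~i';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    // Keep only the captured path from every match.
    return array_map(function (array $m): string {
        return $m[3];
    }, $matches);
}

$html = <<<'HTML'
<img src="test0.png">
<td background='bg.jpg'>
<div style="background: url('test1.gif')"></div>
<div style="background: url(test2.gif)"></div>
<img src="http://somesite.com/someurl.exe?ext=jpg">
HTML;

// Expected paths: test0.png, bg.jpg, test1.gif, test2.gif (the .exe URL is skipped).
print_r(findImagePaths($html));
```

A regex like this still treats HTML as plain text, so it can be fooled by markup inside comments or scripts; it is meant only to show how the prefix alternation and the quote backreference fit together.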
Comments (1)
If you're sure about those attribute names (src, url and background)...