Get the src value from all <img> tags that contain a specific keyword

Posted on 2025-01-02 09:17:01

I'm trying to match src="URL" tags like the following:

src="http://3.bp.blogspot.com/-ulEY6FtwbtU/Twye18FlT4I/AAAAAAAAAEE/CHuAAgfQU2Q/s320/DSC_0045.JPG"

Basically, anything that has some sort of bp.blogspot URL inside the src attribute. I have the following, but it's only partially working:

preg_match('/src=\"(.*)blogspot(.*)\"/', $content, $matches);

Comments (2)

野侃 2025-01-09 09:17:01

This one accepts all blogspot urls and allows escaped quotes:

src="((?:[^"]|(?:(?<!\\)(?:\\\\)*\\"))+\bblogspot\.com/(?:[^"]|(?:(?<!\\)(?:\\\\)*\\"))+)"

The URL gets captured to match group 1.

To use it in preg_match(…), you will need to escape every occurrence of \ and / with an additional \.

Explanation:

src=" # needle 1
( # start of capture group
    (?: # start of anonymous group
        [^"] # non-quote chars
        | # or:
        (?:(?<!\\)(?:\\\\)*\\") # escaped chars
    )+ # end of anonymous group
    \b # start of word (word boundary)
    blogspot\.com/ # needle 2
    (?: # start of anonymous group
        [^"] # non-quote chars
        | # or:
        (?:(?<!\\)(?:\\\\)*\\") # escaped chars
    )+ # end of anonymous group
) # end of capture group
" # needle 3
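
As a rough illustration of how the pattern above might be dropped into PHP, the sketch below sidesteps the extra escaping instead of adding it: a nowdoc string keeps the backslashes literal, and `~` is used as the delimiter so that `/` needs no escaping. The sample `$content` is made up for the example, and `preg_match_all()` is swapped in for `preg_match()` on the assumption that every matching URL is wanted.

```php
<?php
// Sketch only: $content is a made-up sample, not data from the question.
$content = '<img src="http://example.com/logo.png">'
         . '<img src="http://3.bp.blogspot.com/-ulEY6FtwbtU/Twye18FlT4I/AAAAAAAAAEE/CHuAAgfQU2Q/s320/DSC_0045.JPG">';

// A nowdoc keeps every backslash literal, and "~" as the delimiter
// means the "/" in the pattern needs no extra escaping.
$pattern = <<<'REGEX'
~src="((?:[^"]|(?:(?<!\\)(?:\\\\)*\\"))+\bblogspot\.com/(?:[^"]|(?:(?<!\\)(?:\\\\)*\\"))+)"~
REGEX;

// preg_match_all() collects every match; capture group 1 holds the URL.
preg_match_all($pattern, $content, $matches);
$urls = $matches[1];

print_r($urls);
```

With `/` kept as the delimiter and an ordinary quoted string, the escaping described above would still be required.
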
Smile简单爱 2025-01-09 09:17:01

Use XPath to target only the img tags whose src value contains blogspot.

Code:

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$result = [];
foreach ($xpath->query("//img[contains(@src, 'blogspot')]/@src") as $src) {
    $result[] = $src->nodeValue;
}
var_export($result);
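
For trying the snippet above on its own, here is a minimal self-contained sketch. The helper name `blogspotSrcs()` and the sample `$html` are assumptions made for illustration, and the `libxml_use_internal_errors()` calls are an addition that keeps parser warnings quiet when the markup is not perfectly valid. It assumes PHP's DOM extension is available.

```php
<?php
// Hypothetical helper name; requires PHP's DOM extension.
function blogspotSrcs(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select the src attribute of every img whose src contains "blogspot".
    $xpath = new DOMXPath($doc);
    $result = [];
    foreach ($xpath->query("//img[contains(@src, 'blogspot')]/@src") as $src) {
        $result[] = $src->nodeValue;
    }
    return $result;
}

// Made-up sample input.
$html = '<p><img src="http://example.com/logo.png">'
      . '<img src="http://3.bp.blogspot.com/-ulEY6FtwbtU/Twye18FlT4I/AAAAAAAAAEE/CHuAAgfQU2Q/s320/DSC_0045.JPG"></p>';

var_export(blogspotSrcs($html));
```

Compared with the regular expression, this approach leaves attribute quoting and escaping to the HTML parser.
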