Strip an anchor down to its contents only if its URL contains specific text
Does anyone know a regex function in PHP to strip an anchor of its contents, only if the anchor's href attribute contains specific text?
For example, I have an HTML page with links throughout, but I want to strip only the anchors whose URL contains "yahoo". So <a href="http://pages.yahoo.com/page1">Example page</a>
would become: Example page, while other anchors in the HTML that do not contain "yahoo" would be left alone.
Comments (2)
Firstly, this isn't a regex problem (or at least it shouldn't be). PHP comes with an HTML parser, so I'd strongly recommend using that.
With the parser, you just need to loop through all the anchor tags, check each href attribute, modify the anchor if necessary, and then save the document back to HTML. For example:
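A minimal sketch of that loop, assuming DOMDocument is the parser meant here. The function name unwrapAnchorsByHost and its parameters are hypothetical, chosen only for illustration. It checks the host part of each href with parse_url() and replaces matching anchors with their own child nodes:

```php
<?php
// Hypothetical helper: the name and signature are assumptions for illustration.
// Replaces every <a> whose href host contains $needle with the anchor's contents.
function unwrapAnchorsByHost(string $html, string $needle): string
{
    $doc = new DOMDocument();

    // Real-world markup is often invalid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html); // assumes non-empty input in a libxml-friendly encoding
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it to an array
    // before removing nodes while iterating.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $anchor) {
        $host = parse_url($anchor->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($host) || stripos($host, $needle) === false) {
            continue; // no host, or host does not match: leave the anchor alone
        }

        // Move each child in front of the anchor, then drop the empty anchor.
        while ($anchor->firstChild !== null) {
            $anchor->parentNode->insertBefore($anchor->firstChild, $anchor);
        }
        $anchor->parentNode->removeChild($anchor);
    }

    // Serializes the whole document, including the doctype and html/body
    // wrappers that the parser adds around a bare fragment.
    return $doc->saveHTML();
}
```

Calling unwrapAnchorsByHost($html, 'yahoo') should leave the text of the yahoo links in place and keep every other anchor intact. Matching on the host rather than the full URL avoids stripping links that merely mention "yahoo" in a path or query string.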
Using parse_url() here is optional. You could simply check whether the attribute value contains "yahoo" anywhere, without pulling out just the host name. This is significantly better and more robust than any regex-based solution to the same problem.
Try this function.
You can use it like this: stripAnchorTags($html);
If you want it to ignore yahoo links, call it like this: stripAnchorTags($html, "yahoo");