Regex to replace relative links with root-relative links
I have a string of text that contains HTML with all different types of links (relative, absolute, root-relative). I need a regex that can be executed by PHP's preg_replace to replace all relative links with root-relative links, without touching any of the other links. I already have the root path.
Replaced links:
<tag ... href="path/to_file.ext" ... > ---> <tag ... href="/basepath/path/to_file.ext" ... >
<tag ... href="path/to_file.ext" ... /> ---> <tag ... href="/basepath/path/to_file.ext" ... />
Untouched links:
<tag ... href="/any/path" ... >
<tag ... href="/any/path" ... />
<tag ... href="protocol://domain.com/any/path" ... >
<tag ... href="protocol://domain.com/any/path" ... />
Comments (2)
If you just want to change the base URI, you can try the BASE element (for example, <base href="/basepath/">). But note that changing the base URI affects all relative URIs, not just relative URI paths.
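Otherwise, if you really want to use a regular expression, consider that the kind of relative path you want must be of the type path-noscheme (see RFC 3986):
path-noscheme = segment-nz-nc *( "/" segment )
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
So the beginning of the URI must match a segment-nz-nc: one or more unreserved, percent-encoded, or sub-delimiter characters or "@", with no colon and no leading slash.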
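One way to express that check in PHP is sketched below. The pattern is derived from the RFC 3986 grammar rather than copied from the answer, and the function name isRelativePath is hypothetical. It also accepts a "?" or "#" after the first segment, since a relative reference may carry a query or fragment.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when
// $href begins with an RFC 3986 segment-nz-nc, i.e. a relative path
// with no scheme and no leading slash.
function isRelativePath(string $href): bool
{
    // unreserved / sub-delims / "@" characters, or a percent-encoded octet
    $segmentNzNc = '(?:[A-Za-z0-9._\~!$&\'()*+,;=@-]|%[0-9A-Fa-f]{2})+';

    // The first segment must be followed by "/", "?", "#" or the end of the value.
    return preg_match('~^' . $segmentNzNc . '(?=[/?#]|\z)~', $href) === 1;
}

var_dump(isRelativePath('path/to_file.ext'));               // relative path
var_dump(isRelativePath('/any/path'));                      // root-relative
var_dump(isRelativePath('protocol://domain.com/any/path')); // absolute
```

Values such as "../up" also pass this check, so whether to prefix them is a separate decision.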
But please use a proper HTML parser to parse the HTML and build a DOM from it. Then you can query the DOM for the href attributes and test each value against the pattern described above.
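The parser-based approach could look roughly like the following sketch, which uses PHP's bundled DOM extension. The function name rebaseHrefs and the '/basepath/' value are illustrative, and the pattern is one reading of RFC 3986's segment-nz-nc rather than the answer's own expression. DOMDocument::loadHTML treats the input as a whole document, so the serialized output may include wrappers such as html and body, and non-ASCII input may need an explicit encoding declaration.

```php
<?php
// Hypothetical helper: prefixes every relative href value with $basePath.
// Assumes $html is non-empty and $basePath ends with a slash.
function rebaseHrefs(string $html, string $basePath): string
{
    // First path segment per RFC 3986 segment-nz-nc, followed by "/", "?", "#" or the end.
    $relative = '~^(?:[A-Za-z0-9._\~!$&\'()*+,;=@-]|%[0-9A-Fa-f]{2})+(?=[/?#]|\z)~';

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Visit every element that carries an href attribute.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@href]') as $element) {
        $href = $element->getAttribute('href');
        if (preg_match($relative, $href) === 1) {
            $element->setAttribute('href', $basePath . $href);
        }
    }

    return $doc->saveHTML();
}

echo rebaseHrefs(
    '<a href="path/to_file.ext">a</a> <a href="/any/path">b</a>',
    '/basepath/'
);
```

Because the parser isolates attribute values, text that merely looks like href="something" outside a tag is left alone.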

I came up with this:
It might be a little too simplistic. The obvious flaw I see is that it will also match href="something" when it is outside of a tag, but hopefully it can get you started.
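A minimal sketch of a preg_replace call in this spirit follows; it is not necessarily the pattern this answer used. The function name prefixRelativeHrefs is hypothetical, and the sketch assumes double-quoted attribute values and a base path that ends with a slash and contains no "$" or backslash. It shares the flaw described above: it does not check that the match sits inside a tag.

```php
<?php
// Hypothetical helper: regex-only rewrite of relative href values.
function prefixRelativeHrefs(string $html, string $basePath): string
{
    // Match href="..." unless the value starts with "/", "#", "?" or a
    // scheme followed by ":" (for example "http:" or "mailto:").
    // The lookbehind skips attribute names such as data-href.
    $pattern = '~(?<![\w-])href="(?![/#?]|[a-z][a-z0-9+.-]*:)([^"]+)"~i';

    // ${1} is the captured original value; fall back to the input on a PCRE error.
    return preg_replace($pattern, 'href="' . $basePath . '${1}"', $html) ?? $html;
}

echo prefixRelativeHrefs('<a href="path/to_file.ext">', '/basepath/'), "\n";
echo prefixRelativeHrefs('<a href="/any/path">', '/basepath/'), "\n";
echo prefixRelativeHrefs('<a href="protocol://domain.com/any/path">', '/basepath/'), "\n";
```

The negative lookahead is what leaves root-relative and absolute links untouched, since a non-matching href is simply never replaced.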