Regex to add the base domain to directory paths

Posted on 2024-09-11 15:16:53, 396 characters, 4 views, 0 comments

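I need to cache 10 websites. When a page is cached, images, CSS, JS and so on do not display correctly because the base domain is not prepended to the relative paths. I need a regular expression that adds the base domain to those paths. Example below:

Base domain: http://www.example.com

The problem appears when the cached page contains img src="thumb/123.jpg" or src="/inc/123.js".

With img src="http://www.example.com/thumb/123.jpg" or src="http://www.example.com/inc/123.js" they display correctly.

The regex should do something like: if (src=") is not followed by the base domain, add the base domain.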

百思不得你姐 2024-09-18 15:16:54

Without knowing the language, you can use the (maybe most portable) substitution operator:

s/(src=")\/?([^"]+")/$1http:\/\/www.example.com\/$2/

This does the following:
1. Matches the string 'src="' (and captures it in variable $1).
2. Skips an optional leading /, then matches one or more non-double-quote (") characters followed by " (and captures them in variable $2).
3. Inserts 'http://www.example.com/' between the two capture groups.

Depending on the language, you can wrap this in a conditional that checks for the existence of the domain and substitutes only if it isn't found.

To check for the domain: /www\.example\.com/i should do.
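
Since the question does not name a language, here is a minimal sketch of that substitute-plus-conditional idea in Python. Python itself, the helper name add_base and the BASE constant are assumptions for illustration; the function is meant to be applied to one tag or line at a time, because the domain check looks at the whole string it is given.

```python
import re

BASE = "http://www.example.com/"  # base domain taken from the question

# src=" followed by an optional leading slash, then the rest of the value
SRC_RE = re.compile(r'(src=")/?([^"]+")')


def add_base(tag: str) -> str:
    """Prefix BASE to the src value unless the domain is already present."""
    if re.search(r"www\.example\.com", tag, re.IGNORECASE):
        return tag  # domain found, leave the tag untouched
    # A function replacement avoids any escaping issues in the replacement text
    return SRC_RE.sub(lambda m: m.group(1) + BASE + m.group(2), tag)


print(add_base('<img src="thumb/123.jpg">'))
print(add_base('<script src="/inc/123.js"></script>'))
```

Both relative forms from the question end up under http://www.example.com/, while a tag that already contains the domain is returned unchanged.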

EDIT: See comments:

For PHP, I would do this a bit differently. I would probably use simplexml. I don't think that will translate well, though, so here's a regex one...

$html = file_get_contents('/path/to/file.html');
// Skip src/href values that already start with the base domain
$regex_match = '~(src="|href=")(?!http://www\.example\.com/)/?([^"]+")~i';
$regex_substitute = '$1http://www.example.com/$2';
$html = preg_replace($regex_match, $regex_substitute, $html);

Note: I haven't actually run this to debug it, it's just off the cuff. I would be concerned about 3 things. First, / is the usual pattern delimiter for preg_replace, so every / inside the pattern would need escaping; using ~ as the delimiter avoids that. I don't think you're concerned with this, though, unless VB has a similar problem. Second, if there's a chance that line breaks would get in the way, I might change the regex. Third, I added the (?!http://www\.example\.com/) bit. This negative lookahead restricts the match to any src or href that doesn't already have http://www.example.com/ there, but it depends on the type of regex being used (PCRE supports lookaheads, POSIX does not).

The rest of the changes should be fine (I added href=" and also made it case-insensitive with the i modifier). preg_replace already replaces every match, so PHP needs no global (g) modifier; in flavors that do have one, leaving it off means only the first match is replaced.

I hope that helps.
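
The same whole-page rewrite can be sketched in Python. This is only an illustration under assumptions: Python is not mentioned in the thread, and absolutize, ATTR_RE and BASE_URL are hypothetical names. Instead of encoding the "already absolute?" test in the pattern, the sketch hands each attribute value to the standard library's urllib.parse.urljoin, which leaves absolute URLs alone and resolves both relative forms against the base.

```python
import re
from urllib.parse import urljoin

BASE_URL = "http://www.example.com/"  # base domain taken from the question

# src="..." or href="..." with a non-empty, double-quoted value
ATTR_RE = re.compile(r'\b(src|href)="([^"]+)"', re.IGNORECASE)


def absolutize(html: str, base_url: str = BASE_URL) -> str:
    """Rewrite every src/href value in html to an absolute URL."""
    def fix(match: re.Match) -> str:
        attr, value = match.group(1), match.group(2)
        # urljoin returns value unchanged when it is already absolute
        return f'{attr}="{urljoin(base_url, value)}"'

    return ATTR_RE.sub(fix, html)


page = (
    '<img src="thumb/123.jpg">\n'
    '<script SRC="/inc/123.js"></script>\n'
    '<link href="http://www.example.com/a.css">'
)
print(absolutize(page))
```

re.sub replaces every match by default, so nothing like a global flag is needed, and line breaks between tags do not matter because each attribute is matched on its own.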

嘿咻 2024-09-18 15:16:54

Matching regular expression:

(?:src|href)="(http://www\.example\.com/)?.+
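
One way to read this pattern: the optional group only participates in the match when the value already starts with the base domain, so an empty group 1 signals a path that still needs the prefix. A small Python sketch of that check follows; the language and the helper name has_base_domain are assumptions, not part of the answer.

```python
import re

# Pattern copied from the answer; group 1 is the optional base domain
PATTERN = re.compile(r'(?:src|href)="(http://www\.example\.com/)?.+')


def has_base_domain(attribute: str) -> bool:
    """True when the src/href value already starts with the base domain."""
    match = PATTERN.search(attribute)
    # group(1) is None when the optional domain part did not match
    return bool(match and match.group(1))


print(has_base_domain('src="thumb/123.jpg"'))
print(has_base_domain('src="http://www.example.com/thumb/123.jpg"'))
```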
