Anything but a subexpression
I am trying to write a regex in PHP to identify relative src paths. My idea was to use a lookahead (?= followed by a negation ^ and a subexpression (http), but this doesn't work. The ^ works for a single character, but not with a subexpression. Is there an && operator or something?
<img.*?src=[\'\"]\(?=^(http))
I need it to test against the entire "http", or else imgs whose src starts with h, t or p will be excluded as well. Any suggestions? Is this task too big for regex?
Comments (2)
You can use a negative lookahead, which is (?!...) instead of (?=...). For your example (I'd put the anchor at the start), the pattern should read: start of string, then something which is not "http".
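The pattern from the original answer did not survive on this page, so the following is only a minimal sketch of what that description suggests: an anchor followed by a negative lookahead. The sample paths are hypothetical and chosen for illustration.

```php
<?php
// "Start of string, then something which is not http":
// ^ anchors at the start, (?!http) fails if "http" follows.
$pattern = '/^(?!http)/';

// Hypothetical sample paths (not from the original question).
$paths = [
    'images/logo.png',
    'http://example.com/a.png',
    'home.png',
    'https://example.com/b.png',
];

$relative = [];
foreach ($paths as $path) {
    // preg_match returns 1 when the pattern matches.
    if (preg_match($pattern, $path) === 1) {
        $relative[] = $path;
    }
}

print_r($relative);
```

Because the lookahead tests the whole sequence "http", a path such as home.png is still accepted even though it starts with h.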
Edit: since you updated with a fuller example:
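The fuller pattern is also missing from this page. As a hedged sketch, assuming the goal is to capture src values of img tags that do not begin with "http", the lookahead can be placed right after the opening quote. The markup and variable names below are hypothetical.

```php
<?php
// Sketch: capture src values inside <img> tags that do not start with "http".
// [^>]*? keeps the search inside one tag, (["\']) remembers the quote style,
// (?!http) is the negative lookahead, and \1 requires the same closing quote.
$pattern = '/<img\b[^>]*?\bsrc=(["\'])(?!http)(.*?)\1/i';

// Hypothetical markup for illustration.
$html = '<img src="images/a.png"> '
      . '<img alt="x" src=\'http://example.com/b.png\'> '
      . '<img src="thumbs/c.jpg">';

preg_match_all($pattern, $html, $matches);

// Group 2 holds the captured src values.
$relativeSrcs = $matches[2];
print_r($relativeSrcs);
```

This sketch treats anything not starting with "http" as relative, so protocol-relative URLs such as //cdn.example.com/x.png would also be captured, and unquoted attribute values are not handled.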
Of course, for proper parsing you should use DOM ;)
It's not the most useful answer, but it sounds as though you've reached the limit of applicability for regex in HTML parsing.
As per the answer linked here, look at using an HTML DOM parser. I haven't used PHP DOM parsers much, but I know that in other languages a DOM parser often makes HTML tasks a 30-second job, rather than an hour or more of testing weird exceptional cases.
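Neither answer shows the DOM approach, so here is a minimal sketch of how it might look with PHP's bundled DOMDocument class (this assumes the dom extension is enabled). The markup and the "does not start with http" rule are assumptions carried over from the question.

```php
<?php
// Hypothetical markup for illustration.
$html = '<p><img src="images/a.png">'
      . '<img src="http://example.com/b.png">'
      . '<img alt="no src"></p>';

// Collect libxml warnings internally instead of emitting them,
// since real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$relativeSrcs = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    // getAttribute returns an empty string when the attribute is absent.
    $src = $img->getAttribute('src');
    // Keep non-empty values that do not start with "http" (case-insensitive).
    if ($src !== '' && stripos($src, 'http') !== 0) {
        $relativeSrcs[] = $src;
    }
}

print_r($relativeSrcs);
```

The parser takes care of quoting styles, attribute order and line breaks inside tags, which is where a regex tends to accumulate special cases.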