Match all "http"-only URLs without the additional characters

Posted 2024-09-09 10:21:23

I have tried the expressions below.

(http:\/\/.*?)['\"\< \>]


(http:\/\/[-a-zA-Z0-9+&@#\/%?=~_|!:,.;\"]*[-a-zA-Z0-9+&@#\/%=~_|\"])

The first one works well, but the match always includes one extra trailing character after the URL.

For example:

http://domain.com/path.html" 

http://domain.com/path.html<

Notice the trailing " and <. I don't want them included with the URLs.

撕心裂肺的伤痛 2024-09-16 10:21:23

You can use a lookahead instead of making ['\"\< >] part of the match, i.e.:

(http:\/\/.*?)(?=['\"\< >])

Generally speaking, whereas ab matches ab, a(?=b) matches a (if it's followed by b).
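To make the difference concrete, here is a minimal Python sketch comparing a pattern that consumes the terminator with one that only asserts it via a lookahead. The sample text and variable names are assumptions for illustration; only the terminator set comes from the question.

```python
import re

# Sample text is an assumption; it embeds URLs followed by " and >.
TEXT = 'see "http://domain.com/path.html" and <http://domain.com/other.html>'

# Terminator is consumed: the trailing character becomes part of the match.
consuming = re.compile(r"""http://.*?['"< >]""")

# Terminator is only asserted by the lookahead: checked, but not consumed.
lookahead = re.compile(r"""http://.*?(?=['"< >])""")

with_terminator = consuming.findall(TEXT)
without_terminator = lookahead.findall(TEXT)

print(with_terminator)     # URLs with the extra trailing character
print(without_terminator)  # URLs only
```

Note that neither pattern matches a URL sitting at the very end of the input with no terminator after it. Adding an end-of-input alternative inside the lookahead, such as `(?=['"< >]|$)`, is one possible way to cover that case.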

References

regular-expressions.info/Lookarounds

Capturing group option

Lookarounds are not supported by all flavors. More widely supported are capturing groups.

Generally speaking, whereas (a)b still matches ab, it also captures a in group 1.
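As a rough illustration in Python, the question's first pattern already has the URL in a capturing group, so reading group 1 instead of the whole match drops the terminator. The sample text and variable names below are assumptions.

```python
import re

# Sample text is an assumption for illustration.
TEXT = 'href="http://domain.com/path.html" rel="nofollow"'

# The terminator is still consumed, but only the URL sits inside group 1.
pattern = re.compile(r"""(http://.*?)['"< >]""")

m = pattern.search(TEXT)
whole_match = m.group(0)  # URL plus the terminator character
url_only = m.group(1)     # URL alone

print(whole_match)
print(url_only)

# With exactly one capturing group, findall returns the group 1 values.
all_urls = pattern.findall(TEXT)
print(all_urls)
```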

References

regular-expressions.info/Round Brackets for Grouping

Negated character class option

Depending on the need, using a negated character class is often much better than using a reluctant .*? (followed, in this case, by a lookahead to assert the terminator pattern).
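Applied to the URL problem from the question, the idea might look like the Python sketch below. The pattern is not taken from the original answer; it is an assumption that reuses the question's terminator set inside a negated class, and the sample text is made up for illustration.

```python
import re

# Sample text is an assumption; the last URL has no terminator after it.
TEXT = ('see "http://domain.com/path.html" and '
        '<http://domain.com/other.html> then http://domain.com/end')

# Match "http://" followed by any run of characters that are NOT terminators.
url_pattern = re.compile(r"""http://[^'"< >]*""")

urls = url_pattern.findall(TEXT)
print(urls)
```

Because the class simply stops at the first terminator, no lookahead is needed, and a URL at the end of the input is matched as well. The class as written still accepts tabs and newlines, so the terminator set may need extending for real input.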

Let's consider the problem of matching "everything between A and ZZ". As it turns out, this specification is ambiguous: we will come up with 3 patterns that do this, and they will yield different matches. Which one is "correct" depends on the expectation, which is not properly conveyed in the original statement.

We use the following as input:

eeAiiZooAuuZZeeeZZfff

We use 3 different patterns:

A(.*)ZZ yields 1 match: AiiZooAuuZZeeeZZ (as seen on ideone.com)
- This is the greedy variant; group 1 matched and captured iiZooAuuZZeee
A(.*?)ZZ yields 1 match: AiiZooAuuZZ (as seen on ideone.com)
- This is the reluctant variant; group 1 matched and captured iiZooAuu
A([^Z]*)ZZ yields 1 match: AuuZZ (as seen on ideone.com)
- This is the negated character class variant; group 1 matched and captured uu

Here's a visual representation of what they matched:

         ___n
        /   \              n = negated character class
eeAiiZooAuuZZeeeZZfff      r = reluctant
  \_________/r   /         g = greedy
   \____________/g
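The three results above can be reproduced with a short Python sketch. The dictionary keys and variable names are assumptions chosen for illustration; the input and the patterns are the ones listed above.

```python
import re

INPUT = "eeAiiZooAuuZZeeeZZfff"

PATTERNS = {
    "greedy": r"A(.*)ZZ",        # .* takes as much as possible
    "reluctant": r"A(.*?)ZZ",    # .*? takes as little as possible
    "negated": r"A([^Z]*)ZZ",    # [^Z]* can never run across a Z
}

results = {}
for name, pattern in PATTERNS.items():
    m = re.search(pattern, INPUT)
    # Store (whole match, group 1) for each variant.
    results[name] = (m.group(0), m.group(1))
    print(f"{name:10} match={m.group(0)!r} group1={m.group(1)!r}")
```

The negated variant fails at the first A because `[^Z]*` cannot cross the single Z in "iiZoo", so the regex engine moves on and matches from the second A instead.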

References

regular-expressions.info/Character Class and Repetition: An Alternative to Laziness
