I don't understand this regular expression

I'm following along with a tutorial (Ruby) that uses a regex to remove all HTML tags from a string: `product.description.gsub(/<.*?>/,'')`.

I don't know how to interpret the `?`. Does it mean "at least one of the previous"? In that case, wouldn't `/<.+>/` have been more appropriate?
Comments (4)
In this case, it makes `*` lazy. `1*` matches as many `1`s as possible; `1*?` matches as few `1`s as possible.

Here, when you have `<a>text<b>some more text`, `<.*>` will match `<a>text<b>`. `<.*?>`, however, will match `<a>` and `<b>`.

See also: Laziness Instead of Greediness

Another important note here is that this regex can easily fail on valid HTML; it is better to use an HTML parser and get the text of your document.
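The behaviour described in this answer can be checked with a short Ruby sketch. The sample string comes from the answer; the method names (`greedy_matches`, `lazy_matches`, `strip_tags`) and the second input with a `>` inside an attribute value are made up here for illustration, not taken from the tutorial.

```ruby
# Sample input taken from the answer above.
SAMPLE = '<a>text<b>some more text'

# Greedy: .* consumes as much as it can, so the match ends at the last '>'.
def greedy_matches(text)
  text.scan(/<.*>/)
end

# Lazy: .*? consumes as little as it can, so each match ends at the first '>'.
def lazy_matches(text)
  text.scan(/<.*?>/)
end

# Hypothetical helper mirroring the tutorial's gsub call.
def strip_tags(text)
  text.gsub(/<.*?>/, '')
end

p greedy_matches(SAMPLE)  # => ["<a>text<b>"]
p lazy_matches(SAMPLE)    # => ["<a>", "<b>"]
p strip_tags(SAMPLE)      # => "textsome more text"

# A '>' inside an attribute value is legal HTML, but it ends the lazy match
# early and leaves part of the tag behind.
p strip_tags('<a title="a > b">link</a>')  # => " b\">link"
```

The last line is one example of the regex failing on valid HTML, which is the reason both this answer and the next one suggest a real HTML parser.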
By default, `.*` is greedy, which means that it matches as much as possible. So with `.*`, a single match runs from the first `<` to the last `>` on the line, and the replacement removes everything in between, text included.

If you put a question mark after a quantifier, it makes the quantifier non-greedy, so that it matches as little as possible. With `.*?`, each match stops at the first `>` after its `<`, so each tag is removed separately and the text between the tags is kept.

This is different from the more common use of `?` as a quantifier, where it means "match zero or one".

Either way, if your text is HTML you should use an HTML parser instead of regular expressions.
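To make the two roles of `?` concrete, here is a small Ruby sketch. The method names and the sample strings are invented for illustration; only the `<.*?>` pattern comes from the question.

```ruby
# Role 1: '?' directly after a character (or group) is a quantifier
# meaning "zero or one" of that item.
def colour_spelling?(word)
  word.match?(/\Acolou?r\z/)
end

# Role 2: '?' directly after another quantifier makes that quantifier lazy.
def first_tag_lazy(text)
  text[/<.*?>/]
end

# Same pattern without the '?', for comparison: the quantifier stays greedy.
def first_tag_greedy(text)
  text[/<.*>/]
end

p colour_spelling?('color')    # => true
p colour_spelling?('colour')   # => true
p colour_spelling?('colouur')  # => false

p first_tag_lazy('<b>bold</b> text')    # => "<b>"
p first_tag_greedy('<b>bold</b> text')  # => "<b>bold</b>"
```

Which role applies depends only on what precedes the `?`: a plain item gives "zero or one", while a quantifier such as `*`, `+` or `{n,m}` gives the lazy variant.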
Quantifiers such as `*` are greedy by default. This means they match as much as possible. Adding `?` after them makes them lazy, so they stop matching as soon as possible.
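The same rule holds for the other quantifiers, including the `+` that the question proposes. The sketch below is only an illustration: `first_match` is a hypothetical helper and the inputs are made up.

```ruby
DIGITS = '12345'
TAGS   = '<a>text<b>'

# Hypothetical helper: return the first substring matched by the pattern.
def first_match(pattern, text)
  text[pattern]
end

p first_match(/\d+/, DIGITS)       # => "12345"  greedy: all the digits
p first_match(/\d+?/, DIGITS)      # => "1"      lazy: one digit is enough
p first_match(/\d{2,4}/, DIGITS)   # => "1234"   greedy: up to four
p first_match(/\d{2,4}?/, DIGITS)  # => "12"     lazy: the minimum of two
p first_match(/\d*?/, DIGITS)      # => ""       lazy: zero digits is enough

# '+' is greedy as well, so /<.+>/ overshoots in the same way /<.*>/ does.
p first_match(/<.+>/, TAGS)        # => "<a>text<b>"
p first_match(/<.+?>/, TAGS)       # => "<a>"
```

So replacing `*?` with `+` changes the minimum number of characters from zero to one, but it does not stop the match from running on to the last `>`.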

That's the best website I found about regex after the regex library:
http://www.wellho.net/regex/java.html
Hope that helps!