Python re.search (regex): match only words that follow a pattern like {{world}}
I have an HTML file in which I have inserted custom tags such as {{name}} and {{surname}}. I want to find only the tags that exactly match the pattern {{world}}, and not variants such as {world}}, {{world}, {world}, { word }, or {{ world }}.
I wrote this small piece of code:
re.findall(r'\{(\w.+?)\}', html_string)
It returns the words from {{world}}, {world}, and {world}}, which I don't want. I want to match exactly {{world}}. Can anybody please guide me?
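To make the behaviour described in the question concrete, here is a small sketch that runs the question's pattern against each variant it lists. The sample strings come from the question; the names ORIGINAL, samples, and results are only illustrative.

```python
import re

# The pattern from the question: one literal brace on each side.
ORIGINAL = re.compile(r'\{(\w.+?)\}')

# The variants listed in the question.
samples = ["{{world}}", "{world}", "{world}}", "{{world}", "{ word }", "{{ world }}"]

# Map each sample to whatever the pattern captures from it.
results = {s: ORIGINAL.findall(s) for s in samples}

for s, found in results.items():
    print(f"{s!r:16} -> {found}")
```

Because the pattern asks for only one brace on each side, every variant without inner spaces yields the same capture, so the doubled braces are never actually checked. The dot can also match a closing brace, so a one-letter tag such as {{a}} is captured as 'a}'.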
Comments (5)
Um, shouldn't the regex be:
Ok, after the comments, I understand your requirements better:
should work for you.
Basically, you want {{any number of word characters, including underscore}}. You don't actually need the lazy match in this case, so you may remove the ? in the expression. Something like {{keyword1}} other stuff {{keyword2}} will no longer match as a whole. To get only the keyword, without the {{}}, use the following:
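The code from this answer did not survive on this page, so the following is only a sketch of what the description suggests: two literal braces on each side, one or more word characters in between, and an optional capturing group. The patterns and the names WITH_BRACES, NAME_ONLY, and text are assumptions, not the answerer's exact code.

```python
import re

# Assumed pattern: doubled braces around one or more word characters.
# No lazy quantifier is needed because \w can never match '}'.
WITH_BRACES = re.compile(r'\{\{\w+\}\}')

# Same pattern with a capturing group, so findall returns only the names.
NAME_ONLY = re.compile(r'\{\{(\w+)\}\}')

text = "{{keyword1}} other stuff {{keyword2}}"

whole_tags = WITH_BRACES.findall(text)   # tags including the braces
names = NAME_ONLY.findall(text)          # only the text inside the braces

print(whole_tags)
print(names)
```

With a capturing group present, re.findall returns the group contents rather than the whole match, which is why the second pattern drops the braces.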
How about this?
Or, if you want the curly braces included in the results:
If you're trying to accomplish HTML templating, though, I recommend using a good template engine.
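To illustrate why a template engine is recommended, here is a deliberately minimal sketch of what substitution with a plain regex looks like. The function render, the TAG pattern, and the sample values are all hypothetical; a real engine handles escaping contexts, loops, and missing values far more carefully.

```python
import re
from html import escape

# Assumed tag syntax: doubled braces around word characters only.
TAG = re.compile(r'\{\{(\w+)\}\}')

def render(template, values):
    """Replace each {{name}} with its HTML-escaped value, if one is known."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Leave unknown tags untouched instead of guessing.
            return match.group(0)
        return escape(str(values[name]))
    return TAG.sub(substitute, template)

page = "<p>Hello {{name}} {{surname}}, {{ unknown }} {{missing}}</p>"
rendered = render(page, {"name": "Ada", "surname": "Lovelace & Co"})
print(rendered)
```

Even this small version has to decide what to do about escaping and unknown names, which is the sort of detail a template engine already takes care of.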
This will not allow any curly braces inside the matched result. Is that what you want?
http://rubular.com/r/79YwR13MS0
If you want to match doubled curly brackets, you should specify them in your regex:
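One way to read this advice, sketched under the assumption that each candidate tag is checked on its own: spell out both braces on each side and use re.fullmatch so that nothing else is tolerated. The pattern and the names EXACT_TAG, candidates, and accepted are illustrative; the candidate strings are the ones from the question.

```python
import re

# Both opening and both closing braces are written out explicitly.
EXACT_TAG = re.compile(r'\{\{\w+\}\}')

candidates = ["{{world}}", "{world}}", "{{world}", "{world}", "{ word }", "{{ world }}"]

# fullmatch succeeds only if the entire string fits the pattern.
accepted = [c for c in candidates if EXACT_TAG.fullmatch(c)]

print(accepted)
```

Only the properly doubled form passes; the variants with a missing brace or inner spaces are rejected.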
You say the other answers don't work, but they seem to work for me:
If it doesn't work for you, you'll need to give more details.
Edit: How about the following? Getting rid of the dot (.) and using only \w also allows you to use greedy quantifiers, and it works for the example HTML from your comment:
\w matches alphanumeric characters and the underscore; if you need to match more characters, you could add them to a set (e.g., [\w\+] to also match the plus sign).
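The difference between plain \w and an extended set can be sketched as follows. The sample HTML string and the names WORD_ONLY and WORD_OR_PLUS are made up for illustration and are not the example from the asker's comment.

```python
import re

# Only word characters (letters, digits, underscore) between the braces.
WORD_ONLY = re.compile(r'\{\{(\w+)\}\}')

# A character set that additionally allows the plus sign.
# Inside [...] the plus needs no backslash, although [\w\+] works too.
WORD_OR_PLUS = re.compile(r'\{\{([\w+]+)\}\}')

html_string = "<b>{{first_name}}</b> {{c++}} {{a+b}}"

strict = WORD_ONLY.findall(html_string)
relaxed = WORD_OR_PLUS.findall(html_string)

print(strict)
print(relaxed)
```

Neither pattern needs a lazy quantifier, because the character class cannot match a closing brace and therefore cannot run past the end of a tag. In Python 3, \w on str patterns also matches non-ASCII letters unless the re.ASCII flag is given.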