Python regex to find all digits and dots

Posted on 2024-07-09 21:58:03

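I'm using re.findall() to extract some version numbers from an HTML file:

>>> import re
>>> text = "<table><td><a href=\"url\">Test0.2.1.zip</a></td><td>Test0.2.1</td></table> Test0.2.1"
>>> re.findall("Test([\.0-9]*)", text)
['0.2.1.', '0.2.1', '0.2.1']

However, I only want the version numbers that do not end in a dot. The file extension might not always be .zip, so I can't just hard-code .zip in the regex.

I want to end up with:

['0.2.1', '0.2.1']

Can anyone suggest a better regex to use? :)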



落花随流水 2024-07-16 21:58:03

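re.findall(r"Test([0-9.]*[0-9]+)", text)

or, a bit shorter:

re.findall(r"Test([\d.]*\d+)", text)

Both patterns require the captured text to end in a digit, so the trailing dot before the file extension is dropped.

By the way, you do not need to escape the dot in a character class. Inside [] the . has no special meaning; it just matches a literal dot, so escaping it has no effect.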
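As a quick check, here is a minimal sketch that runs the suggested pattern against the sample text from the question and confirms that an escaped and an unescaped dot behave the same inside a character class. The last pattern, with a negative lookahead, is not from the answer; it is a hypothetical variant, on the assumption that the goal is to skip the filename occurrence entirely rather than trim its trailing dot. The variable names are illustrative only.

```python
import re

text = '<table><td><a href="url">Test0.2.1.zip</a></td><td>Test0.2.1</td></table> Test0.2.1'

# Pattern from the answer: the capture must end on a digit,
# so the dot before "zip" is left out of the first match.
trimmed = re.findall(r"Test([0-9.]*[0-9]+)", text)
print(trimmed)  # ['0.2.1', '0.2.1', '0.2.1']

# Inside a character class, "\." and "." match the same thing.
escaped = re.findall(r"Test([\.0-9]*)", text)
unescaped = re.findall(r"Test([.0-9]*)", text)
print(escaped == unescaped)  # True

# Hypothetical variant (an assumption, not part of the answer):
# the negative lookahead rejects any match that is directly followed
# by another digit or dot, so "Test0.2.1.zip" yields no match at all.
strict = re.findall(r"Test([0-9.]*[0-9])(?![0-9.])", text)
print(strict)  # ['0.2.1', '0.2.1']
```

Note that the answer's pattern returns three entries, because the filename occurrence is trimmed rather than dropped; the lookahead variant returns the two-element list shown in the question.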
