re.sub replace with matched content

Posted on 2024-12-01 12:34:28

While getting to grips with regular expressions in Python, I'm trying to wrap part of a URL in HTML so that it is highlighted. My input is

images/:id/size

my output should be

images/<span>:id</span>/size

If I do this in JavaScript

method = 'images/:id/size';
method = method.replace(/\:([a-z]+)/, '<span>$1</span>')
alert(method)

I get the desired result, but if I do this in Python

>>> method = 'images/:id/huge'
>>> re.sub('\:([a-z]+)', '<span>$1</span>', method)
'images/<span>$1</span>/huge'

I don't. How do I get Python to return the correct result rather than a literal $1? Is re.sub even the right function for this?

旧时光的容颜 2024-12-08 12:34:28

Simply use \1 instead of $1:

In [1]: import re

In [2]: method = 'images/:id/huge'

In [3]: re.sub(r'(:[a-z]+)', r'<span>\1</span>', method)
Out[3]: 'images/<span>:id</span>/huge'

Also note the use of raw strings (r'...') for regular expressions. It is not mandatory but removes the need to escape backslashes, arguably making the code slightly more readable.
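
One difference from the JavaScript snippet in the question may be worth a quick sketch: re.sub replaces every non-overlapping match by default, whereas JavaScript's replace with a non-global regex replaces only the first. The count argument limits this. The two-placeholder route string below is a made-up input for illustration, not something from the question.

```python
import re

# Hypothetical route with two placeholders (not from the question).
route = 'images/:id/:size'

# By default re.sub replaces every non-overlapping match.
all_wrapped = re.sub(r'(:[a-z]+)', r'<span>\1</span>', route)
print(all_wrapped)  # images/<span>:id</span>/<span>:size</span>

# count=1 stops after the first match, like a non-global JS replace.
first_only = re.sub(r'(:[a-z]+)', r'<span>\1</span>', route, count=1)
print(first_only)  # images/<span>:id</span>/:size
```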

请别遗忘我 2024-12-08 12:34:28

A backreference to the whole match value is \g<0>; see the re.sub documentation:

The backreference \g<0> substitutes in the entire substring matched by the RE.

See the Python demo:

import re
method = 'images/:id/huge'
print(re.sub(r':[a-z]+', r'<span>\g<0></span>', method))
# => images/<span>:id</span>/huge

If you need to perform a case-insensitive search, add flags=re.I:

re.sub(r':[a-z]+', r'<span>\g<0></span>', method, flags=re.I)
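
As a small sketch of why the keyword matters: the fourth positional parameter of re.sub is count, so the flag should be passed as flags=. The upper-case placeholder below is an assumed input chosen only to show the effect of the flag.

```python
import re

method = 'images/:ID/huge'  # upper-case placeholder, made up for illustration

# Without the flag, [a-z] does not match the upper-case letters.
unchanged = re.sub(r':[a-z]+', r'<span>\g<0></span>', method)
print(unchanged)  # images/:ID/huge

# flags must be passed by keyword; the fourth positional argument is count.
wrapped = re.sub(r':[a-z]+', r'<span>\g<0></span>', method, flags=re.I)
print(wrapped)  # images/<span>:ID</span>/huge
```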

恋竹姑娘 2024-12-08 12:34:28

Use \1 instead of $1.

\number matches the contents of the group of the same number.

http://docs.python.org/library/re.html#regular-expression-syntax
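
For illustration, here is a minimal sketch of the related \g<...> forms that re.sub also accepts in the replacement string. The group name "param" is a hypothetical name chosen for this example.

```python
import re

method = 'images/:id/huge'

# \1 and \g<1> both refer to the first capture group.
by_number = re.sub(r'(:[a-z]+)', r'<span>\g<1></span>', method)
print(by_number)  # images/<span>:id</span>/huge

# A named group can be referenced as \g<name>; "param" is an illustrative name.
by_name = re.sub(r'(?P<param>:[a-z]+)', r'<span>\g<param></span>', method)
print(by_name)  # images/<span>:id</span>/huge

# \g<1>0 means group 1 followed by a literal "0"; \10 would mean group 10.
with_digit = re.sub(r'(:[a-z]+)', r'\g<1>0', method)
print(with_digit)  # images/:id0/huge
```

The \g<1> form is mainly useful when the group reference is immediately followed by a digit.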

芯好空 2024-12-08 12:34:28

For the replacement portion, Python uses \1, as sed and vi do, rather than $1 as Perl, Java, and JavaScript (among others) do. Furthermore, because \1 in a regular string literal is interpreted as the character U+0001, you need to use a raw string or escape the backslash.

Python 3.2 (r32:88445, Jul 27 2011, 13:41:33) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> method = 'images/:id/huge'
>>> import re
>>> re.sub(':([a-z]+)', r'<span>\1</span>', method)
'images/<span>id</span>/huge'
>>>
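
The U+0001 point can be checked directly. The sketch below also moves the colon inside the capture group, since the session above captures only "id" and so drops the colon from the output; the variable names are illustrative.

```python
import re

method = 'images/:id/huge'

# In a regular string literal, '\1' is an octal escape for U+0001.
assert '\1' == '\x01' and len('\1') == 1
# A raw string or a doubled backslash keeps the backslash and the digit.
assert r'\1' == '\\1' and len(r'\1') == 2

# The non-raw replacement inserts the control character, not the group.
broken = re.sub(r'(:[a-z]+)', '<span>\1</span>', method)
print(repr(broken))  # 'images/<span>\x01</span>/huge'

# Putting the colon inside the group keeps it in the output.
fixed = re.sub(r'(:[a-z]+)', '<span>\\1</span>', method)
print(fixed)  # images/<span>:id</span>/huge
```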
