Replace an exact grouped part with a regex in Python

Posted on 2025-01-18 01:24:28

I have a template, and I need to replace part of it using a regex in Python. Here is my template (note that there is at least one newline between the two comments):

hello
how's everything

<!--POSTS:START-->
some text
<!--POSTS:END-->

Some code here

I want to replace everything between <!--POSTS:START--> and <!--POSTS:END--> in Python. So I made the \n([^;]*)\n pattern, but it includes <!--POSTS:START--> and <!--POSTS:END--> too.

Here is what I want:

re.sub('...', 'foo', message)

# expected result:
hello
how's everything

<!--POSTS:START-->
foo
<!--POSTS:END-->

Some code here

Thanks.

探春 2025-01-25 01:24:28

You can use a capture group for each of the start and end markers and reference them as \1, \2, etc. in the replacement string.

If the text has multiple occurrences of <!--POSTS:START-->...<!--POSTS:END-->, the regexp with .*? replaces the content of each of those blocks separately. If the '?' is removed from the regexp, the match becomes greedy and replaces all text from the first start marker to the last end marker.

Try this:

import re

s = '''
hello
how's everything

<!--POSTS:START-->
some text
<!--POSTS:END-->

Some code here
'''

# re.DOTALL lets '.' match newlines, so the replaced body may span several lines
s = re.sub(r'(<!--POSTS:START-->\n).*?(\n<!--POSTS:END-->)', r'\1foo\2', s, flags=re.DOTALL)

# the inline (?s) flag turns on DOTALL inside the pattern itself, with the same result
# s = re.sub(r'(?s)(<!--POSTS:START-->\n).*?(\n<!--POSTS:END-->)', r'\1foo\2', s)

print(s)

Output:

hello
how's everything

<!--POSTS:START-->
foo
<!--POSTS:END-->

Some code here
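
To make the greedy versus non-greedy point concrete, here is a minimal sketch. The two-block template is a hypothetical example (the question's template has only one block), and the variable names are made up for illustration.

```python
import re

# Hypothetical template with two marker blocks (not from the question).
template = (
    "<!--POSTS:START-->\n"
    "first\n"
    "<!--POSTS:END-->\n"
    "keep me\n"
    "<!--POSTS:START-->\n"
    "second\n"
    "<!--POSTS:END-->\n"
)

non_greedy = r'(<!--POSTS:START-->\n).*?(\n<!--POSTS:END-->)'
greedy = r'(<!--POSTS:START-->\n).*(\n<!--POSTS:END-->)'

# .*? stops at the nearest end marker, so each block is replaced on its own.
lazy_result = re.sub(non_greedy, r'\1foo\2', template, flags=re.DOTALL)

# .* runs to the last end marker, so everything in between is swallowed.
greedy_result = re.sub(greedy, r'\1foo\2', template, flags=re.DOTALL)

print(lazy_result)
print(greedy_result)
```

With the non-greedy pattern the "keep me" line survives and both blocks contain foo; with the greedy pattern only one block remains.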

执妄 2025-01-25 01:24:28

Check the re module documentation: https://docs.python.org/3/library/re.html

import re

pattern = r"(<!--POSTS:START-->\n).*(\n<!--POSTS:END-->)"
string = """hello
how's everything

<!--POSTS:START-->
some text
<!--POSTS:END-->

Some code here"""
result = re.sub(pattern, r"\g<1>foo\g<2>", string)
print(result)

Result:

hello
how's everything

<!--POSTS:START-->
foo
<!--POSTS:END-->

Some code here
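
One caveat worth checking: the pattern above has no re.DOTALL flag, so it only works while the text between the markers is a single line. The sketch below uses a hypothetical two-line body to show the difference. It also shows where the \g<1> form helps, namely when the replacement text starts with a digit.

```python
import re

pattern = r"(<!--POSTS:START-->\n).*(\n<!--POSTS:END-->)"

# Hypothetical template whose body spans two lines (not from the question).
multi_line = "<!--POSTS:START-->\nline one\nline two\n<!--POSTS:END-->"

# Without DOTALL, '.' cannot cross the newline inside the body, so nothing matches.
unchanged = re.sub(pattern, r"\g<1>foo\g<2>", multi_line)

# With DOTALL the same pattern matches across lines.
replaced = re.sub(pattern, r"\g<1>foo\g<2>", multi_line, flags=re.DOTALL)

# \g<1> keeps the group reference unambiguous when a digit follows it;
# r"\142" would not be read as "group 1 followed by 42".
numbered = re.sub(pattern, r"\g<1>42\g<2>", multi_line, flags=re.DOTALL)

print(unchanged == multi_line)
print(replaced)
print(numbered)
```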

注定孤独终老 2025-01-25 01:24:28

You can use the following:

import re

# content is assumed to hold the template text from the question
new_content = re.sub(
    r'(<!--POSTS:START-->\n).*?(?=\n<!--POSTS:END-->)', r"\1foo",
    content, flags=re.DOTALL)

The re.DOTALL flag makes the '.' special character match any character at all, including a newline.

I'm using two things to do what you want:

Lookahead group (?=...): asserts that the given subpattern can be matched here, without consuming characters.
Non-greedy quantifier (*?): matches as little as possible, so each block is matched separately.

Because of the lookahead, \n<!--POSTS:END--> is not consumed, so I only need to keep the first group and rewrite the content between the markers. That is why I'm using \1foo and not \1foo\2.
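
A small sketch of what "not consumed" means in practice. It compares the lookahead pattern with a capturing-group variant using re.search; the content string here is an assumed stand-in for the template.

```python
import re

# Assumed stand-in for the template text.
content = "<!--POSTS:START-->\nsome text\n<!--POSTS:END-->"

lookahead = re.search(
    r'(<!--POSTS:START-->\n).*?(?=\n<!--POSTS:END-->)', content, flags=re.DOTALL)
captured = re.search(
    r'(<!--POSTS:START-->\n).*?(\n<!--POSTS:END-->)', content, flags=re.DOTALL)

# The lookahead match ends right after "some text"; the end marker is not part of it.
lookahead_text = lookahead.group(0)

# The capturing variant includes the end marker, so it must be written back with \2.
captured_text = captured.group(0)

print(repr(lookahead_text))
print(repr(content[lookahead.end():]))
print(repr(captured_text))
```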

If you need to modify only the first match, you can use count=1:

re.sub(..., count=1)

You can have anything between those two lines and it will work as expected.
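
As a rough illustration of count=1, the sketch below runs the same lookahead pattern over a hypothetical template with two blocks, once limited to the first match and once without a limit.

```python
import re

# Hypothetical template with two marker blocks (not from the question).
content = (
    "<!--POSTS:START-->\nfirst\n<!--POSTS:END-->\n"
    "<!--POSTS:START-->\nsecond\n<!--POSTS:END-->\n"
)

pattern = r'(<!--POSTS:START-->\n).*?(?=\n<!--POSTS:END-->)'

# count=1 stops after the first replacement; the second block is left alone.
first_only = re.sub(pattern, r'\1foo', content, count=1, flags=re.DOTALL)

# Without count, every block is rewritten.
every_block = re.sub(pattern, r'\1foo', content, flags=re.DOTALL)

print(first_only)
print(every_block)
```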

About the author

失退
