Can I use re.sub (or regexobject.sub) to replace text in a subgroup?
I need to parse a configuration file which looks like this (simplified):
<config>
<links>
<link name="Link1" id="1">
<encapsulation>
<mode>ipsec</mode>
</encapsulation>
</link>
<link name="Link2" id="2">
<encapsulation>
<mode>udp</mode>
</encapsulation>
</link>
</links>
My goal is to be able to change parameters specific to a particular link, but I'm having trouble getting substitution to work correctly. I have a regex that can isolate a parameter value on a specific link, where the value is contained in capture group 1:
link_id = r'id="1"'
parameter = 'mode'
link_regex = r'<link [\w\W]+ %s>[\w\W]*[\w\W]*<%s>([\w\W]*)</%s>[\w\W]*</link>' \
% (link_id, parameter, parameter)
Thus,
print(re.search(link_regex, f_read).group(1))
prints
ipsec
The examples in the regex howto all seem to assume that one wants to use the capture group in the replacement, but what I need to do is replace the capture group itself (e.g. change the Link1 mode from ipsec to udp).
Comments (4)
I have to give you the obligatory: "don't use regular expressions to do this."
Check out how easy it is to do this with BeautifulSoup, for example:
Looking at your regular expression, I can't really tell if this is exactly what you wanted to do, but whatever it is, using a library like BeautifulSoup is much, much better than trying to patch a regular expression together. I highly recommend going this route if possible.
This looks like valid XML. In that case you don't need BeautifulSoup, and definitely not a regex: just load the XML using any good XML library, edit it, and print it out. Here is an approach using ElementTree:
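The snippet that originally accompanied this answer is not preserved here, so the following is only a minimal sketch of the ElementTree approach, using the standard-library xml.etree.ElementTree module. The CONFIG string is the sample from the question with a closing </config> added, which is an assumption so that the document parses.

```python
import xml.etree.ElementTree as ET

# Sample from the question; the closing </config> is an assumption,
# added so that the XML is well-formed.
CONFIG = """<config>
<links>
<link name="Link1" id="1">
<encapsulation>
<mode>ipsec</mode>
</encapsulation>
</link>
<link name="Link2" id="2">
<encapsulation>
<mode>udp</mode>
</encapsulation>
</link>
</links>
</config>"""

root = ET.fromstring(CONFIG)

# Walk the whole tree and overwrite the text of every <mode> element.
for mode in root.iter('mode'):
    mode.text = 'udp'

# Serialize the edited tree back to a string.
print(ET.tostring(root, encoding='unicode'))
```

To touch only one link, as the question asks, root.find(".//link[@id='1']/encapsulation/mode") should return just that element, assuming the layout shown in the sample.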
It will change all mode elements to udp; this is the output:
Supposing that your link_regex is correct, you can add parentheses like this:
and then you could do:
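The code for this answer is missing here, so what follows is a sketch of the idea rather than the original snippet. It wraps the text before and after the value in groups of their own and re-emits them with \g<1> and \g<3>. The function name set_parameter and the CONFIG sample are hypothetical, and the quantifiers are made non-greedy (an assumption that departs from the question's pattern), because the greedy version would latch onto the last <mode> in the file.

```python
import re

# Sample from the question (closing </config> added for completeness).
CONFIG = """<config>
<links>
<link name="Link1" id="1">
<encapsulation>
<mode>ipsec</mode>
</encapsulation>
</link>
<link name="Link2" id="2">
<encapsulation>
<mode>udp</mode>
</encapsulation>
</link>
</links>
</config>"""


def set_parameter(text, link_id, parameter, new_value):
    """Replace the value of <parameter> in the first link matching link_id."""
    # Group 1: start of the link up to and including the opening tag.
    # Group 2: the current value, which is the part being replaced.
    # Group 3: the closing tag through the end of the link.
    link_regex = (
        r'(<link [\w\W]+? %s>[\w\W]*?<%s>)([\w\W]*?)(</%s>[\w\W]*?</link>)'
        % (link_id, parameter, parameter)
    )
    # \g<1> and \g<3> put the unchanged text back; group 2 is dropped.
    # Caveat: backslashes in new_value would be interpreted by re.sub.
    replacement = r'\g<1>%s\g<3>' % new_value
    return re.sub(link_regex, replacement, text, count=1)


updated = set_parameter(CONFIG, r'id="1"', 'mode', 'udp')
print(updated)
```

The \g<1> form is used instead of \1 so that a new value starting with a digit cannot be misread as part of the group number.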
Not sure I'd do it that way, but the quickest way would be to shift the captures:
([\w\W]*[\w\W]*<%s>)[\w\W]*(</%s>[\w\W]*</link>) and replace with group1 + mode + group2
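One way to read "replace with group1 + mode + group2" is to pass a function to re.sub and build the replacement from the two captured pieces. This is only a sketch under that reading: shift_captures and CONFIG are hypothetical names, and the non-greedy quantifiers are an assumption so that the pattern stops at the first matching parameter.

```python
import re

# Sample from the question (closing </config> added for completeness).
CONFIG = """<config>
<links>
<link name="Link1" id="1">
<encapsulation>
<mode>ipsec</mode>
</encapsulation>
</link>
<link name="Link2" id="2">
<encapsulation>
<mode>udp</mode>
</encapsulation>
</link>
</links>
</config>"""


def shift_captures(text, link_id, parameter, new_value):
    """Capture the text around the value, then rebuild it with new_value."""
    # Group 1 ends at the opening tag, group 2 starts at the closing tag;
    # the old value sits between them and is not captured at all.
    regex = (
        r'(<link [\w\W]+? %s>[\w\W]*?<%s>)[\w\W]*?(</%s>[\w\W]*?</link>)'
        % (link_id, parameter, parameter)
    )
    # A callable replacement inserts new_value literally, with no escape
    # processing of backslashes.
    return re.sub(
        regex,
        lambda m: m.group(1) + new_value + m.group(2),
        text,
        count=1,
    )


result = shift_captures(CONFIG, r'id="1"', 'mode', 'udp')
print(result)
```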