Regex backreferences to more than 9 groups in a replacement

Posted 2024-09-10 13:06:35

I have a regex that traverses a string and pulls out 40 values. It looks sort of like the pattern below, but much larger and more complicated:

est(.*)/test>test>(.*)<test><test>(.*)test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test><test>(.*)/test>

My question is: how do I use these groups with the replace command when the group number exceeds 9? It seems that whenever I use \10, it returns the value of \1 and then appends a 0 to the end.

Any help would be much appreciated, thanks :)

Also, I am using UEStudio, but if a different program does it better, then no biggie :)

Comments (5)

我还不会笑 2024-09-17 13:06:35

As pointed out by psycho brm:
Use $10 instead of \10.
I am using Notepad++ and it works beautifully.

故笙诉离歌 2024-09-17 13:06:35

Most of the simple Regex engines used by editors aren't equipped to handle more than 10 matching groups; it doesn't seem like UltraEdit can. I just tried Notepad++ and it won't even match a regex with 10 groups.

Your best bet, I think, is to write something fast in a quick language with a decent regex parser, although that wouldn't answer the question as asked.

Here's something in Python:

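import re

# 15 single-character capture groups, more than \1 to \9 can address
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
with open('input.txt', 'r') as f:
    for line in f:
        m = pattern.match(line)
        if m:  # lines shorter than 15 characters do not match
            print(m.groups())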

Note that Python allows backreferences such as \20: in order to have a backreference to group 2 followed by a literal 0, you need to use \g<2>0, which is unambiguous.
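The difference between the two forms can be checked directly. This is a minimal sketch, assuming Python 3's standard `re` module; the 12-group pattern and the sample string are made up for illustration.

```python
import re

# Hypothetical pattern with 12 single-character groups, for illustration only.
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
text = 'abcdefghijkl'

# \g<10> always means group 10, so there is nothing to misread.
group_ten = pattern.sub(r'\g<10>', text)            # 'j'

# \g<1>0 means group 1 followed by a literal "0".
group_one_then_zero = pattern.sub(r'\g<1>0', text)  # 'a0'

# A plain \10 in a Python replacement template is read as group 10.
plain_ten = pattern.sub(r'\10', text)               # 'j'

print(group_ten, group_one_then_zero, plain_ten)
```

Other engines may read a plain \10 differently, which is why the explicit `\g<...>` form is the safer choice when a digit follows the reference.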

Edit:
Most regex flavors, and editors that embed a regex engine, should follow this replacement syntax:

abcdefghijklmnop
search: (.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(?<name>.)(.)
note:    1  2  3  4  5  6  7  8  9  10 11 12 13
value:   a  b  c  d  e  f  g  h  i  j  k  l  m
replace result:
    \11      a1      i.e.: match 1, then the character "1"
    ${12}    l       most should support this
    ${name}  l       few support named references, but use them where you can.

Named references are usually available only in specific regex flavors; test your tool to know for sure.

緦唸λ蓇 2024-09-17 13:06:35

Put a $ in front of the double-digit subgroup, e.g. \1\2\3\4\5\6\7\8\9$10. It worked for me.

蔚蓝源自深海 2024-09-17 13:06:35

Try using named groups; so instead of writing the tenth group as:

(.*)

use:

(?<group10>.*)

and then use the following replace string:

${group10}

(That's of course in the absence of a better solution using looping, and remember that there might be different regex syntax flavours depending on your environment.)
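For comparison, here is a minimal sketch of the same idea in Python 3. Note the assumption: Python's `re` module spells a named group `(?P<name>...)` and refers to it as `\g<name>` in the replacement, rather than the `(?<name>...)` and `${name}` forms shown above. The group name `group10` comes from the answer; the pattern and input string are hypothetical.

```python
import re

# Nine unnamed groups followed by a tenth, named group (hypothetical pattern).
pattern = re.compile(r'(.)(.)(.)(.)(.)(.)(.)(.)(.)(?P<group10>.*)')
text = 'abcdefghijklm'

# Refer to the tenth group by name instead of by number.
result = pattern.sub(r'[\g<group10>]', text)
print(result)  # [jklm]
```

Naming only the groups past the ninth keeps the rest of an existing pattern untouched, at the cost of depending on a flavor that supports names.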

念﹏祤嫣 2024-09-17 13:06:35

If you cannot handle more than 9 subgroups, why not initially match groups of 9 and then loop and apply regexes to those matches?

i.e. first match (<test.*/test>)+ and then for each subgroup match on <test(.*)/test>.
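A minimal sketch of this looping approach in Python 3. The exact markup in the question is unclear, so the `<test>...</test>` sample string and the non-greedy inner pattern are assumptions; `re.findall` performs the "for each subgroup" loop in a single call.

```python
import re

# Hypothetical input: a run of <test>...</test> elements.
text = '<test>a</test><test>b</test><test>c</test>'

# One small pattern with a single group, applied repeatedly,
# replaces one huge pattern with dozens of numbered groups.
item = re.compile(r'<test>(.*?)</test>')

values = item.findall(text)  # one list entry per element
print(values)  # ['a', 'b', 'c']

# Rebuild the output from the list, indexing values by position.
result = ', '.join(f'{i}={v}' for i, v in enumerate(values, start=1))
print(result)  # 1=a, 2=b, 3=c
```

Because the values end up in an ordinary list, the 9-group limit of the replacement syntax no longer matters: the fortieth value is simply `values[39]`.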
