How to use pyparsing to break a list of lines into individual groups based on line prefix

Posted 2025-01-17 12:27:22 | Characters: 3702 | Views: 1 | Comments: 0

I am trying to parse the output of the command
ip netns exec vpn_ns ipsec stroke statusall (example pasted below).

The command prints several lines for each service. Service ids have the form oof-#n-#i, where #n is the terminator and #i is the instance using that terminator, so

oof-2-1 is instance 1 of terminator server oof-2.

How do I declare a match that collects all the lines prefixed by the same id?

From the example I am trying to get to something like this dict:

results = {
    'connections':
        {
            'oof-1-1': [ 3 lines starting with oof-1-1 in section "Connections" ],
            'oof-1-2': [ 3 lines starting with oof-1-2 in section "Connections" ],
            'oof-2-1': [ 3 lines starting with oof-2-1 in section "Connections" ]
        },

    'sec_assocs':
        {
            'oof-1-1': [ 3 lines starting with oof-1-1 in section "Security Associations" ],
            'oof-1-2': [ 3 lines starting with oof-1-2 in section "Security Associations" ],
            'oof-2-1': [ 3 lines starting with oof-2-1 in section "Security Associations" ]
        }
}

Each id maps to a list of the lines that start with it.

This is the full output from the StrongSwan command.

sample = """
Status of IKE charon daemon (strongSwan 5.9.1, Linux 4.15.0-162-generic, x86_64):
  uptime: 25 hours, since Mar 23 15:23:53 2022
  worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 10
  loaded plugins: charon aesni 
Listening IP addresses:
  169.254.123.2
  192.168.51.254
Connections:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-2:   remote: [server] uses public key authentication
     oof-1-2:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-2-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-2-1:   remote: [server] uses public key authentication
     oof-2-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-2:   remote: [server] uses public key authentication
     oof-1-2:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-2-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-2-1:   remote: [server] uses public key authentication
     oof-2-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""

And this is the sample that is used in the parsing solution:

sample = """
Connections:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-2:   remote: [server] uses public key authentication
     oof-1-2:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-2-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-2-1:   remote: [server] uses public key authentication
     oof-2-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-2:   remote: [server] uses public key authentication
     oof-1-2:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
     oof-2-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-2-1:   remote: [server] uses public key authentication
     oof-2-1:   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""

Comments (1)

榕城若虚 2025-01-24 12:27:23

Regrouping the lines after they have been parsed is the most direct way to handle this kind of data. Here is the BNF for the structure you are trying to parse:

group ::= label ':' line...
label ::= word...
line ::= prefix ':' rest_of_line
prefix ::= word '-' int '-' int

where word and int are just a Word of alphas or nums, and '...' indicates repetition.

This translates to pyparsing as:

import pyparsing as pp

COLON = pp.Suppress(":")
label = pp.Combine(
            pp.Word(pp.alphas)[1, ...], adjacent=False, joinString=" "
            )
prefix = pp.Combine(
            pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums)
            )
post_prefix = COLON + pp.restOfLine
line = pp.Group(prefix("prefix") + post_prefix)
lines = pp.Group(line[...])
group = pp.Group(label("group_label") + COLON + lines("subgroups"))
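
Before any regrouping is added, it can help to see what this grammar returns on its own. The following is a minimal sketch that assumes pyparsing 3.0 or later and uses the snake_case names (join_string, rest_of_line, parse_string) in place of the older camelCase ones. The excerpt is a shortened piece of the sample, not real command output.

```python
import pyparsing as pp

COLON = pp.Suppress(":")
label = pp.Combine(pp.Word(pp.alphas)[1, ...], adjacent=False, join_string=" ")
prefix = pp.Combine(pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums))
line = pp.Group(prefix("prefix") + COLON + pp.rest_of_line)
lines = pp.Group(line[...])
group = pp.Group(label("group_label") + COLON + lines("subgroups"))

# Shortened excerpt in the same shape as the sample
excerpt = """
Connections:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
"""

parsed = group[...].parse_string(excerpt)
section = parsed[0]
print(section.group_label)
for entry in section.subgroups:
    # entry[0] is the prefix, entry[1] is the text after the colon
    print(entry.prefix, repr(entry[1]))
```

At this stage every input line is its own entry, so the same prefix shows up once per line.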

Pyparsing will generate this railroad diagram for you:

[Image: parser railroad diagram]

This parses your text, but to regroup the lines by their prefixes, we can add a parse action that uses itertools.groupby:

def regroup_lines(t):
    from itertools import groupby
    from operator import itemgetter

    ret = pp.ParseResults([])
    parsed_lines = t[0]
    for prefix, subgroup in groupby(parsed_lines, key=itemgetter("prefix")):
        # each line in subgroup has the prefix and the rest of the line after the ':'
        # repackage the multiple lines into a single group that is labeled with 
        # the common prefix, and contains the line contents
        ret.append(pp.ParseResults.from_dict(
            {
                'prefix': prefix,
                'lines': [line[1] for line in subgroup],
            }
        ))
    return ret

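lines.add_parse_action(regroup_lines)

# lines("subgroups") stores a copy of lines, so rebuild group now that the
# parse action is attached (the full source below defines group after this step)
group = pp.Group(label("group_label") + COLON + lines("subgroups"))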
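
One thing to keep in mind about itertools.groupby is that it only merges consecutive items with equal keys. That suits this output, where the lines for one id are contiguous. Here is a small standalone sketch using plain tuples with made-up text instead of parsed results:

```python
from itertools import groupby
from operator import itemgetter

# (prefix, text) pairs; the last row repeats an earlier prefix out of order
rows = [
    ("oof-1-1", "first"),
    ("oof-1-1", "second"),
    ("oof-1-2", "third"),
    ("oof-1-1", "fourth"),
]

# groupby starts a new run every time the key changes
runs = [
    (key, [text for _, text in run])
    for key, run in groupby(rows, key=itemgetter(0))
]
print(runs)
```

The out-of-order row ends up in a second, separate "oof-1-1" run. If the real output could interleave ids like that, sorting by prefix first or accumulating into a dict would be safer than relying on groupby alone.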

By using a parse action, the regrouping is done at parse time, so no additional post-parsing processing is needed.

Now we can parse your sample and get the regrouped results:

results = group[...].parseString(sample)
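
Note that this runs against the trimmed sample. The grammar as written starts at a section label followed directly by a colon, so the status preamble of the full command output would most likely not match. One way to get from the full output to the trimmed text is to slice it at the first section header. This is only a sketch: trim_to_sections is a hypothetical helper, and it assumes the "Connections:" header is always present.

```python
def trim_to_sections(text, first_header="Connections:"):
    """Return text from the first wanted section header onward (hypothetical helper)."""
    start = text.find(first_header)
    if start == -1:
        raise ValueError(f"section header {first_header!r} not found")
    return text[start:]


# Shortened excerpt of the full output from the question
full_output = """\
Status of IKE charon daemon (strongSwan 5.9.1, Linux 4.15.0-162-generic, x86_64):
  uptime: 25 hours, since Mar 23 15:23:53 2022
Listening IP addresses:
  169.254.123.2
Connections:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
Security Associations:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
"""

trimmed = trim_to_sections(full_output)
print(trimmed.splitlines()[0])
```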

Here is a short function to print out the parsed groups:

def print_groups(parsed):
    for group in parsed:
        print(group.group_label)
        for subgroup in group.subgroups:
            print(f"- {subgroup.prefix}")
            for line in subgroup.lines:
                print(f"  {line!r}")
        print()

print_groups(results)

Which gives:

Connections
- oof-1-1
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-1-2
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-2-1
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd'

Security Associations
- oof-1-1
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-1-2
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-2-1
  '  %any...10.1.0.242  IKEv2, dpddelay=30s'
  '   remote: [server] uses public key authentication'
  '   child:  dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd'

Here is the full source for the working example:

import pyparsing as pp

COLON = pp.Suppress(":")
label = pp.Combine(pp.Word(pp.alphas)[1, ...], adjacent=False, joinString=" ")
label.setName("label")
prefix = pp.Combine(pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums))
prefix.setName("prefix")
post_prefix = COLON + pp.restOfLine
line = pp.Group(prefix("prefix") + post_prefix)
lines = pp.Group(line[...])


def regroup_lines(t):
    from itertools import groupby
    from operator import itemgetter

    ret = pp.ParseResults([])
    for prefix, subgroup in groupby(t[0], key=itemgetter("prefix")):
        ret.append(pp.ParseResults.from_dict(
            {
                'prefix': prefix,
                'lines': [line[1] for line in subgroup],
            }
        ))
    return ret
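# attach the parse action before lines("subgroups") copies the expression
lines.add_parse_action(regroup_lines)

group = pp.Group(label("group_label") + COLON + lines("subgroups"))
pp.autoname_elements()
# create_diagram needs the optional extras: pip install pyparsing[diagrams]
group.create_diagram("groupby_1.html", show_results_names=True)
# sample is the trimmed text shown in the question
results = group[...].parseString(sample)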


def print_groups(parsed):
    for group in parsed:
        print(group.group_label)
        for subgroup in group.subgroups:
            print(f"- {subgroup.prefix}")
            for line in subgroup.lines:
                print(f"  {line!r}")
        print()

print_groups(results)
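
If you want the nested dict from the question rather than printed output, the regrouped results can be converted in a few lines. This is a sketch, not part of the original solution: SECTION_KEYS and to_dict are names introduced here, the mapping of labels to 'connections' and 'sec_assocs' is assumed from the question, and the snake_case pyparsing names assume pyparsing 3.0 or later.

```python
from itertools import groupby
from operator import itemgetter

import pyparsing as pp

COLON = pp.Suppress(":")
label = pp.Combine(pp.Word(pp.alphas)[1, ...], adjacent=False, join_string=" ")
prefix = pp.Combine(pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums))
line = pp.Group(prefix("prefix") + COLON + pp.rest_of_line)
lines = pp.Group(line[...])


def regroup_lines(t):
    # same regrouping as above: merge consecutive lines that share a prefix
    ret = pp.ParseResults([])
    for pfx, subgroup in groupby(t[0], key=itemgetter("prefix")):
        ret.append(pp.ParseResults.from_dict(
            {"prefix": pfx, "lines": [ln[1] for ln in subgroup]}
        ))
    return ret


lines.add_parse_action(regroup_lines)
group = pp.Group(label("group_label") + COLON + lines("subgroups"))

# Assumed mapping from section labels to the keys used in the question
SECTION_KEYS = {"Connections": "connections", "Security Associations": "sec_assocs"}


def to_dict(parsed):
    out = {}
    for grp in parsed:
        key = SECTION_KEYS.get(grp.group_label, grp.group_label)
        # strip() drops the padding that rest_of_line keeps after the colon
        out[key] = {
            sub.prefix: [text.strip() for text in sub.lines]
            for sub in grp.subgroups
        }
    return out


# Shortened excerpt in the same shape as the sample
excerpt = """
Connections:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
     oof-1-1:   remote: [server] uses public key authentication
     oof-1-2:  %any...10.1.0.242  IKEv2, dpddelay=30s
Security Associations:
     oof-1-1:  %any...10.1.0.242  IKEv2, dpddelay=30s
"""

result = to_dict(group[...].parse_string(excerpt))
print(result)
```

Stripping the text is a choice made here for readability. Drop the strip() call if the original spacing after the colon matters to you.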