How to use pyparsing to split a list of lines into individual groups based on line prefix
I am trying to parse the output of the command ip netns exec vpn_ns ipsec stroke statusall
(example pasted below).
The command prints multiple lines for each service. Services are named oof-#n-#i, where #n is the terminator and #i is the instance using that terminator, so
oof-2-1 is terminator server oof-2, instance 1.
How do I declare a match that collects all the lines prefixed by the same id?
From the example I am trying to get to something like this dict:
results = {
    'connections': {
        'oof-1-1': [ 3 lines starting with oof-1-1 in section "Connections" ],
        'oof-1-2': [ 3 lines starting with oof-1-2 in section "Connections" ],
        'oof-2-1': [ 3 lines starting with oof-2-1 in section "Connections" ]
    },
    'sec_assocs': {
        'oof-1-1': [ 3 lines starting with oof-1-1 in section "Security Associations" ],
        'oof-1-2': [ 3 lines starting with oof-1-2 in section "Security Associations" ],
        'oof-2-1': [ 3 lines starting with oof-2-1 in section "Security Associations" ]
    }
}
Each id maps to a list of the lines that start with it.
This is the full output from the StrongSwan command.
sample = """
Status of IKE charon daemon (strongSwan 5.9.1, Linux 4.15.0-162-generic, x86_64):
uptime: 25 hours, since Mar 23 15:23:53 2022
worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 10
loaded plugins: charon aesni
Listening IP addresses:
169.254.123.2
192.168.51.254
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""
And this is the sample that is used in the parsing solution:
sample = """
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""
Post-processing is the most direct way to handle this kind of regrouping of parsed data. Here is the BNF for the structure you are trying to parse:
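One way to write such a BNF, inferred from the sample lines in the question rather than taken from the answer's own listing, is sketched below. The rule names (status, section, section_name, line, id, rest_of_line) are assumptions chosen for illustration.

```text
status        ::= section...
section       ::= section_name ":" line...
section_name  ::= "Connections" | "Security Associations"
line          ::= id ":" rest_of_line
id            ::= word "-" int "-" int
```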
where word and int are just a Word of alphas or nums, and '...' indicates repetition.
This translates to pyparsing as:
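A minimal sketch of such a grammar follows, assuming pyparsing 3.0 or later (snake_case method names) and assuming the section headers appear exactly as "Connections:" and "Security Associations:". The variable names (conn_id, body, line, section, parser) are illustrative choices, and the sample is an abbreviated copy of the one in the question.

```python
import pyparsing as pp

# Abbreviated copy of the question's sample text.
sample = """
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
"""

integer = pp.Word(pp.nums)
COLON = pp.Suppress(":")

# id ::= word "-" int "-" int, for example "oof-2-1"
conn_id = pp.Combine(pp.Word(pp.alphas) + "-" + integer + "-" + integer)

# rest_of_line keeps leading whitespace, so strip it in a parse action.
body = pp.rest_of_line.copy().add_parse_action(lambda t: t[0].strip())

# line ::= id ":" rest_of_line
line = pp.Group(conn_id + COLON + body)

# section ::= section_name ":" line...
section_name = pp.Literal("Connections") | pp.Literal("Security Associations")
section = pp.Group(section_name + COLON + pp.Group(line[1, ...]))

# status ::= section...
parser = section[1, ...]

# Each section parses to [name, [[id, body], [id, body], ...]].
parsed = parser.parse_string(sample).as_list()
for name, lines in parsed:
    print(name, lines)
```

At this point the lines are still a flat list per section; nothing has been regrouped by id yet.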
Pyparsing can also generate a railroad diagram of this grammar for you (the diagram image is not reproduced here).
This parses your text, but to regroup the lines by their prefixes, we can add a parse action that uses itertools.groupby. By using a parse action, the regrouping is done at parse time, so no additional post-parsing processing is needed.
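The regrouping step might look like the sketch below, again assuming pyparsing 3.0 or later. The function name regroup_by_id, the variable lines_block, and the mapping of section headers to the keys 'connections' and 'sec_assocs' are assumptions made to match the dict requested in the question, not the answer's own code.

```python
from itertools import groupby

import pyparsing as pp

# Abbreviated copy of the question's sample text.
sample = """
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
"""

integer = pp.Word(pp.nums)
COLON = pp.Suppress(":")
conn_id = pp.Combine(pp.Word(pp.alphas) + "-" + integer + "-" + integer)
body = pp.rest_of_line.copy().add_parse_action(lambda t: t[0].strip())
line = pp.Group(conn_id + COLON + body)


def regroup_by_id(tokens):
    """Collect the bodies of all [id, body] groups under their id."""
    grouped = {}
    # groupby yields runs of consecutive lines that share the same id;
    # setdefault/extend also merges runs if an id reappears later on.
    for conn, run in groupby(tokens, key=lambda t: t[0]):
        grouped.setdefault(conn, []).extend(t[1] for t in run)
    # Wrap the dict in a list so pyparsing keeps it as a single token.
    return [grouped]


# Attach the parse action to the repetition, so it sees every line
# of one section at once.
lines_block = line[1, ...].add_parse_action(regroup_by_id)

# Rename the section headers to the keys used in the question's dict.
section_name = (
    pp.Literal("Connections").add_parse_action(lambda t: "connections")
    | pp.Literal("Security Associations").add_parse_action(lambda t: "sec_assocs")
)
section = pp.Group(section_name + COLON + lines_block)
parser = section[1, ...]

# Each parsed section is [key, {id: [body, ...]}].
results = {sec[0]: sec[1] for sec in parser.parse_string(sample)}
print(results)
```

Because itertools.groupby only groups consecutive items, it fits this output well: lines for one id are printed together. The setdefault call is a small safeguard in case an id shows up again later in the same section.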
Now we can parse your sample and get the regrouped results:
Here is a short function to print out the parsed groups:
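A printing helper along these lines could be as simple as the sketch below. It works on a plain nested dict shaped like the one requested in the question, so it does not depend on pyparsing; the names format_groups, print_groups and the demo data are hypothetical.

```python
def format_groups(results):
    """Return indented text lines for {section: {id: [line, ...]}}."""
    out = []
    for section_name, groups in results.items():
        out.append(section_name)
        for conn_id, lines in groups.items():
            out.append(f"  {conn_id}")
            out.extend(f"    {text}" for text in lines)
    return out


def print_groups(results):
    """Print each section, then each id, then that id's lines."""
    print("\n".join(format_groups(results)))


# Hand-written stand-in for the regrouped parse results.
demo = {
    "connections": {
        "oof-1-1": [
            "%any...10.1.0.242 IKEv2, dpddelay=30s",
            "remote: [server] uses public key authentication",
        ],
    },
    "sec_assocs": {
        "oof-1-1": ["%any...10.1.0.242 IKEv2, dpddelay=30s"],
    },
}

print_groups(demo)
```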
Which gives:
Here is the full source for the working example: