Creating a complex data structure by parsing an output file
I'm looking for some advice on how to create a data structure by parsing a file.
This is the list I have in my file.
'01bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02',
'01bpar( 3)= 0.00000000E+00',
'02epar( 1)= 0.49998963E+02',
'02epar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02',
'02epar( 3)= 0.00000000E+00',
'02epar( 4)= 0.17862340E-01 half_life= 0.3880495E+02 relax_time= 0.5598371E+02',
'02bpar( 1)= 0.49998962E+02',
'02bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02',
What I need to do is construct a data structure which should look like this:
http://img11.imageshack.us/img11/7645/datastructure.gif
(couldn't post it because of the new user restriction)
I've managed to write all the regexp filters to get what is needed, but I can't work out how to construct the structure.
Any ideas?
It's theoretically possible to have pyparsing create the whole structure using parse actions, but if you just name the various fields as I have below, building up the structure is not too bad. And if you want to convert to using REs, this example should give you a start on how things might look:
Prints:
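For the RE route mentioned in this answer, here is a minimal sketch of the "name the fields" idea using named groups in Python's `re` module instead of pyparsing. The group names (`run`, `kind`, `index`, `value`) and the assumed number format are illustrative choices, not taken from the original answer, and the pattern only covers the line shapes shown in the question.

```python
import re

# Assumed number format: a signed decimal with an optional exponent, e.g. 0.23103878E-01.
NUM = r"[-+]?\d+\.\d+(?:[eE][-+]?\d+)?"

# Hypothetical field names (run, kind, index, ...): labels chosen for this sketch only.
LINE_RE = re.compile(
    rf"(?P<run>\d+)(?P<kind>[a-z])par\(\s*(?P<index>\d+)\)=\s*(?P<value>{NUM})"
    rf"(?:\s+half_life=\s*(?P<half_life>{NUM})\s+relax_time=\s*(?P<relax_time>{NUM}))?"
)

def parse_line(line):
    """Return the named fields of one line as a dict, or None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # half_life and relax_time are optional, so they stay None when absent.
    for key in ("value", "half_life", "relax_time"):
        if fields[key] is not None:
            fields[key] = float(fields[key])
    fields["index"] = int(fields["index"])
    return fields

sample = "01bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02"
record = parse_line(sample)
print(record["run"], record["kind"], record["index"], record["half_life"])
```

Once every line yields a dict of named fields like this, assembling whatever nested structure you need is mostly a matter of choosing which fields become keys.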
Consider using a dict of dicts.
Produces:
Revised version in light of comments:
which produces:
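As an illustration of the dict-of-dicts suggestion, the sketch below keys the outer dict on the block name (for example `02e`) and the inner dict on the parameter index. That layout is an assumption, since the target structure is only shown in the linked image, and the helper names `HEAD_RE`, `EXTRA_RE`, and `build` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed number format: a signed decimal with an optional exponent.
NUM = r"[-+]?\d+\.\d+(?:[eE][-+]?\d+)?"
# Leading part of a line: block name, parameter index, main value.
HEAD_RE = re.compile(rf"(\d+[a-z])par\(\s*(\d+)\)=\s*({NUM})")
# Any trailing "key= number" pairs such as half_life or relax_time.
EXTRA_RE = re.compile(rf"(\w+)=\s*({NUM})")

SAMPLE = [
    "01bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
    "01bpar( 3)= 0.00000000E+00",
    "02epar( 1)= 0.49998963E+02",
    "02epar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
    "02epar( 3)= 0.00000000E+00",
    "02epar( 4)= 0.17862340E-01 half_life= 0.3880495E+02 relax_time= 0.5598371E+02",
    "02bpar( 1)= 0.49998962E+02",
    "02bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
]

def build(lines):
    """Build {block_name: {index: {"value": ..., optional extras}}} from the lines."""
    data = defaultdict(dict)
    for line in lines:
        head = HEAD_RE.search(line)
        if head is None:
            continue  # skip lines that do not look like parameter lines
        name, index, value = head.groups()
        entry = {"value": float(value)}
        # Only scan the text after the main value for extra key/value pairs.
        for key, number in EXTRA_RE.findall(line[head.end():]):
            entry[key] = float(number)
        data[name][int(index)] = entry
    return dict(data)

data = build(SAMPLE)
print(sorted(data))
print(data["02e"][4])
```

Lookups then read naturally, for example `data["02e"][4]["half_life"]`, and a missing `half_life` shows up as an absent key rather than a placeholder value.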
Your top-level structure is positional, so it's a perfect choice for a list. Since lists can hold arbitrary items, a named tuple is a good fit for each entry. Each item in the tuple can hold a list with its elements.
So, your code should look something like this pseudocode:
You said you could already loop over the file, and had various regexes to get the data, so I didn't bother adding all the details.
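To make the list-of-named-tuples idea concrete, here is one possible sketch. The tuple types `Group` and `Param`, their field names, and the regex are assumptions made for illustration; the real layout depends on the structure shown in the question's image.

```python
import re
from collections import namedtuple

# Hypothetical record types: one Group per block, holding a list of Param entries.
Param = namedtuple("Param", "index value half_life relax_time")
Group = namedtuple("Group", "name params")

# Assumed number format: a signed decimal with an optional exponent.
NUM = r"[-+]?\d+\.\d+(?:[eE][-+]?\d+)?"
LINE_RE = re.compile(
    rf"(?P<name>\d+[a-z]par)\(\s*(?P<index>\d+)\)=\s*(?P<value>{NUM})"
    rf"(?:\s+half_life=\s*(?P<half_life>{NUM})\s+relax_time=\s*(?P<relax_time>{NUM}))?"
)

SAMPLE = [
    "01bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
    "01bpar( 3)= 0.00000000E+00",
    "02epar( 1)= 0.49998963E+02",
    "02epar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
    "02epar( 3)= 0.00000000E+00",
    "02epar( 4)= 0.17862340E-01 half_life= 0.3880495E+02 relax_time= 0.5598371E+02",
    "02bpar( 1)= 0.49998962E+02",
    "02bpar( 2)= 0.23103878E-01 half_life= 0.3000133E+02 relax_time= 0.4328278E+02",
]

def to_float(text):
    """Convert an optional regex group to float, keeping None for missing fields."""
    return float(text) if text is not None else None

def build_groups(lines):
    """Return a list of Group tuples, in file order, each with its list of Params."""
    groups = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # ignore lines that are not parameter lines
        param = Param(
            int(m.group("index")),
            float(m.group("value")),
            to_float(m.group("half_life")),
            to_float(m.group("relax_time")),
        )
        # Start a new Group whenever the block name changes (assumes grouped input).
        if not groups or groups[-1].name != m.group("name"):
            groups.append(Group(m.group("name"), []))
        groups[-1].params.append(param)
    return groups

groups = build_groups(SAMPLE)
for group in groups:
    print(group.name, [p.index for p in group.params])
```

The sketch relies on lines for the same block being adjacent in the file, as they are in the question's sample; if they can be interleaved, a dict keyed on the block name would be safer than comparing against the last group.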