pyparsing, Forward, and recursion
I'm using pyparsing to parse vcd (value change dump) files. Essentially, I want to read in a file, parse it into an internal dictionary, and manipulate the values.
Without going into details on the structure, my problem occurs with identifying nested categories.
In vcd files, you have 'scopes' which include wires and possibly some deeper (nested) scopes. Think of them like levels.
So in my file, I have:
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
So 'toplevel' encompasses everything (a - i), 'midlevel' has (a - b), 'extralevel' has (f - g), etc.
Here is my code (snippet) for parsing this section:
scope_header = Group(Literal('$scope') + Word(alphas) + Word(alphas) + \
Literal('$end'))
wire_map = Group(Literal('$var') + Literal('wire') + Word(alphas) + \
Literal('$end'))
scope_footer = Group(Literal('$upscope') + Literal('$end'))
scope = Forward()
scope << (scope_header + ZeroOrMore(wire_map) + ZeroOrMore(scope) + \
ZeroOrMore(wire_map) + scope_footer)
Now, what I thought happens is that as it hits each scope, it would keep track of each 'level' and I would end up with a structure containing nested scopes. However, it errors out on
$scope module extralevel $end
saying it expects '$upscope'.
So I know I'm not using the recursion correctly. Can someone help me out? Let me know if I need to provide more info.
Thanks!!!!
Comments (2)
According to your definition, a scope cannot contain another scope, followed by some maps, followed by another scope.
If the parser has a debug mode where it prints its parse tree, you will be able to see this immediately. But in short, you're saying there are zero or more maps, followed by zero or more scopes, followed by zero or more maps, so if there is a scope, followed by a map, you have already passed the scope field, so any more scopes are invalid. If the language used by pyparsing supports "or" you could use:
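A minimal sketch of that idea, assuming pyparsing's `|` operator for alternation and reusing the question's expression names. The names `broken_scope` and `fixed_scope`, and the extra `Group` around the scope body (so that each nested scope stays in its own sublist), are illustrative assumptions rather than the answerer's exact code:

```python
from pyparsing import (Forward, Group, Literal, ParseException, Word,
                       ZeroOrMore, alphas)

vcd_text = """
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
"""

scope_header = Group(Literal('$scope') + Word(alphas) + Word(alphas) + Literal('$end'))
wire_map = Group(Literal('$var') + Literal('wire') + Word(alphas) + Literal('$end'))
scope_footer = Group(Literal('$upscope') + Literal('$end'))

# Grammar from the question: wires, then scopes, then wires, in that fixed order.
broken_scope = Forward()
broken_scope <<= (scope_header + ZeroOrMore(wire_map) + ZeroOrMore(broken_scope)
                  + ZeroOrMore(wire_map) + scope_footer)

# Alternation: wires and nested scopes may be interleaved in any order.
fixed_scope = Forward()
fixed_scope <<= Group(scope_header + ZeroOrMore(wire_map | fixed_scope) + scope_footer)

try:
    broken_scope.parseString(vcd_text)
    broken_failed = False
except ParseException as err:
    broken_failed = True
    print("original grammar fails:", err)

result = fixed_scope.parseString(vcd_text, parseAll=True)
# Subtract the header and footer groups to count the direct children.
print("items directly inside toplevel:", len(result[0]) - 2)
```

The original grammar gives up once a second nested scope shows up after some wires, because by then it has already moved past its only `ZeroOrMore(scope)` slot.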
Please choose @ZackBloom's answer as the correct one; he intuited it right off, without even knowing pyparsing's syntax.
Just a few comments/suggestions on your grammar:
With the answer posted above, you can visualize the nesting using pprint and pyparsing's asList() method on ParseResults.
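A small sketch of that visualization, assuming the scope body is wrapped in `Group` as the nesting described here requires; the variable names `vcd_text`, `res`, and `nested` are placeholders chosen for illustration:

```python
from pprint import pprint

from pyparsing import Forward, Group, Literal, Word, ZeroOrMore, alphas

vcd_text = """
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
"""

scope_header = Group(Literal('$scope') + Word(alphas) + Word(alphas) + Literal('$end'))
wire_map = Group(Literal('$var') + Literal('wire') + Word(alphas) + Literal('$end'))
scope_footer = Group(Literal('$upscope') + Literal('$end'))

scope = Forward()
scope <<= Group(scope_header + ZeroOrMore(wire_map | scope) + scope_footer)

res = scope.parseString(vcd_text)
nested = res.asList()  # plain nested Python lists, easy to pretty-print
pprint(nested)
```

Each scope shows up as its own sublist, starting with its `$scope ... $end` header and ending with its `$upscope $end` footer.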
So now you have nicely structured results. But you can clean things up a bit. For one thing, now that you have structure, you don't really need all those $scope, $end, etc. tokens. You can certainly just step over them as you navigate through the parsed results, but you can also have pyparsing just drop them from the parsed output (since the results are now structured, you're not really losing anything). Change your parser definitions to suppress them. (No need to group scope_footer: everything in that expression is suppressed, so Group would just give you an empty list.) With those tokens dropped, the really important bits stand out more clearly.
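One way this could look, as a sketch that assumes pyparsing's `Suppress` class is used to drop the marker tokens; the upper-case names `SCOPE`, `VAR`, `UPSCOPE`, and `END` are illustrative choices, not taken from the text above:

```python
from pprint import pprint

from pyparsing import Forward, Group, Literal, Suppress, Word, ZeroOrMore, alphas

vcd_text = """
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
"""

# Suppress matches the token but leaves it out of the parsed results.
SCOPE, VAR, UPSCOPE, END = map(Suppress, "$scope $var $upscope $end".split())

scope_header = Group(SCOPE + Word(alphas) + Word(alphas) + END)
wire_map = Group(VAR + Literal('wire') + Word(alphas) + END)
scope_footer = UPSCOPE + END  # fully suppressed, so no Group needed

scope = Forward()
scope <<= Group(scope_header + ZeroOrMore(wire_map | scope) + scope_footer)

cleaned = scope.parseString(vcd_text).asList()
pprint(cleaned)
```

Headers shrink to pairs like `['module', 'toplevel']` and wires to pairs like `['wire', 'a']`, while the nesting is unchanged.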
At the risk of too much grouping, I would suggest also Grouping the content of your scope expression.
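A sketch of that extra grouping, assuming the suppressed-token definitions described earlier in this answer (the `SCOPE`, `VAR`, `UPSCOPE`, and `END` names are illustrative):

```python
from pprint import pprint

from pyparsing import Forward, Group, Literal, Suppress, Word, ZeroOrMore, alphas

vcd_text = """
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
"""

SCOPE, VAR, UPSCOPE, END = map(Suppress, "$scope $var $upscope $end".split())
scope_header = Group(SCOPE + Word(alphas) + Word(alphas) + END)
wire_map = Group(VAR + Literal('wire') + Word(alphas) + END)
scope_footer = UPSCOPE + END

scope = Forward()
# The inner Group collects all wires and subscopes into a single contents list.
scope <<= Group(scope_header + Group(ZeroOrMore(wire_map | scope)) + scope_footer)

grouped = scope.parseString(vcd_text).asList()
pprint(grouped)
```

Every scope then parses to exactly two items, a header and a contents list, however many wires or subscopes it holds.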
Now every scope result has 2 predictable elements: the module header, and a list of wires or subscopes. This predictability will make it a lot easier to write the recursive code that will navigate the results.
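As an illustration of such recursive navigation, here is a sketch that assumes the two-element scope structure just described; the helper name `describe_scope` and its indentation format are hypothetical:

```python
from pyparsing import Forward, Group, Literal, Suppress, Word, ZeroOrMore, alphas

vcd_text = """
$scope module toplevel $end
$scope module midlevel $end
$var wire a $end
$var wire b $end
$upscope $end
$var wire c $end
$var wire d $end
$var wire e $end
$scope module extralevel $end
$var wire f $end
$var wire g $end
$upscope $end
$var wire h $end
$var wire i $end
$upscope $end
"""

SCOPE, VAR, UPSCOPE, END = map(Suppress, "$scope $var $upscope $end".split())
scope_header = Group(SCOPE + Word(alphas) + Word(alphas) + END)
wire_map = Group(VAR + Literal('wire') + Word(alphas) + END)
scope = Forward()
scope <<= Group(scope_header + Group(ZeroOrMore(wire_map | scope)) + UPSCOPE + END)


def describe_scope(scope_tokens, indent=""):
    """Return indented text lines for one [header, contents] scope (hypothetical helper)."""
    header, contents = scope_tokens
    lines = [indent + "- " + header[1]]  # header is ['module', <name>]
    for item in contents:
        if item[0] == "wire":  # a wire is ['wire', <name>]
            lines.append(indent + "  wire: " + item[1])
        else:  # anything else is a nested [header, contents] scope
            lines.extend(describe_scope(item, indent + "  "))
    return lines


lines = describe_scope(scope.parseString(vcd_text).asList()[0])
print("\n".join(lines))
# - toplevel
#   - midlevel
#     wire: a
#     wire: b
#   wire: c
#   wire: d
#   wire: e
#   - extralevel
#     wire: f
#     wire: g
#   wire: h
#   wire: i
```

The same traversal could just as well build the internal dictionary mentioned in the question instead of text lines.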
Good first question, welcome to SO and pyparsing!