ParseKit making unexpected calls to selectors
I have the following very simple (test) grammar file:
@start = expression+;
expression = keyword | otherWord;
otherWord = Word;
keyword = a | the;
a = 'a';
the = 'the';
I then run the following code:
// grammar holds the contents of the grammar file above.
PKParser *parser = [[PKParserFactory factory] parserFromGrammar:grammar assembler:self];
NSString *s = @"The parrot";
[parser parse:s];
PKReleaseSubparserTree(parser);
along with the following methods:
- (void)didMatchA:(PKAssembly *)a {
    [self log:a type:@"didMatchA "];
}
- (void)didMatchThe:(PKAssembly *)a {
    [self log:a type:@"didMatchThe "];
}
- (void)didMatchKeyword:(PKAssembly *)a {
    [self log:a type:@"didMatchKeyword "];
}
- (void)didMatchExpression:(PKAssembly *)a {
    [self log:a type:@"didMatchExpression "];
}
- (void)didMatchOtherWord:(PKAssembly *)a {
    [self log:a type:@"didMatchOtherWord "];
}
- (void)log:(PKAssembly *)assembly type:(NSString *)type {
    PKToken *token = [assembly top];
    NSLog(@"Method: [%@], token: %@, assembly: %@", type, token, assembly);
}
Finally, I get these messages in the log:
[1] Method: [didMatchThe ], token: The, assembly: [The]The^parrot
[2] Method: [didMatchKeyword ], token: The, assembly: [The]The^parrot
[3] Method: [didMatchOtherWord ], token: The, assembly: [The]The^parrot
[4] Method: [didMatchExpression ], token: The, assembly: [The]The^parrot
[5] Method: [didMatchExpression ], token: The, assembly: [The]The^parrot
[6] Method: [didMatchOtherWord ], token: parrot, assembly: [The, parrot]The/parrot^
[7] Method: [didMatchExpression ], token: parrot, assembly: [The, parrot]The/parrot^
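This sort of makes sense, but I can't understand why #5 occurs. I'd really like to be able to remove the double matches, so that keywords such as "The" fire only didMatchThe and not didMatchKeyword.
Unfortunately, ParseKit's documentation seems to be non-existent when it comes to its grammar syntax and how it decides which methods to fire. And yes, I've dug through the source code as well :-)
Does anyone have experience with ParseKit who can shed some light on this?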
Comments (1)
I'm the developer of ParseKit, and this is actually correct behavior. Here are a few points to help clear this up:
1. The best way to learn how ParseKit works is to buy "Building Parsers with Java" by Steven John Metsker. ParseKit is based almost entirely on the designs laid out there.
2. ParseKit's parser component is extremely dynamic and features infinite look-ahead. This makes it ideal for quick development or for easily parsing small inputs, but it also means ParseKit exhibits extremely poor performance when parsing large documents.
3. Because of ParseKit's infinite look-ahead, the assembler methods you implement will be called many times. In fact, they will appear to be called too many times, as you've described above. This is normal: ParseKit explores every parse path available to it at any given time, so you get "too many" callbacks.
4. The answer is to never work on ivars in your assembler callback methods. Instead, always keep the state of what you are working on in the current PKAssembly's target ivar (a.target). The current PKAssembly is the one passed into your callback method.
Hope that helps.
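To make the reasoning about "too many" callbacks concrete, here is a small toy model in Python. It is not ParseKit and does not use its API: Assembly, Recorder, did_match, shared_log, and the combinator functions are hypothetical names invented for this sketch, and literal matching is assumed to be case-insensitive. The model only shows the general idea of a parser that explores every alternative, firing a callback on each path, while each path carries its own copy of a target. The exact order and number of callbacks differ from the log in the question.

```python
from dataclasses import dataclass, field


@dataclass
class Assembly:
    """Toy parse state: token position plus a target that belongs to one path."""
    tokens: tuple
    pos: int = 0
    target: list = field(default_factory=list)

    def fork(self):
        # Every explored path gets its own copy, including its own target.
        return Assembly(self.tokens, self.pos, list(self.target))


class Recorder:
    """Hypothetical assembler that stores each callback in two places."""
    def __init__(self):
        self.shared_log = []  # instance state: sees every explored path

    def did_match(self, rule, assembly):
        event = (rule, assembly.tokens[assembly.pos - 1])
        self.shared_log.append(event)   # also records paths that compete or fail
        assembly.target.append(event)   # travels with this path only


def word(rule, literal=None):
    """Match one token; with literal=None any word matches."""
    def match(states, rec):
        out = []
        for state in states:
            if state.pos == len(state.tokens):
                continue
            token = state.tokens[state.pos]
            if literal is None or token.lower() == literal:
                nxt = state.fork()
                nxt.pos += 1
                rec.did_match(rule, nxt)
                out.append(nxt)
        return out
    return match


def alternation(rule, *subparsers):
    """Try every alternative and keep all resulting states."""
    def match(states, rec):
        out = []
        for sub in subparsers:
            for state in sub(states, rec):
                rec.did_match(rule, state)
                out.append(state)
        return out
    return match


def one_or_more(sub):
    def match(states, rec):
        results, current = [], sub(states, rec)
        while current:
            results.extend(current)
            current = sub(current, rec)
        return results
    return match


# Same shape as the grammar in the question.
keyword = alternation("keyword", word("a", "a"), word("the", "the"))
expression = alternation("expression", keyword, word("otherWord"))
start = one_or_more(expression)


def parse(text):
    rec = Recorder()
    tokens = tuple(text.split())
    states = start([Assembly(tokens)], rec)
    complete = [s for s in states if s.pos == len(tokens)]
    return rec, complete[0]  # first path that consumed all input


rec, best = parse("The parrot")
print(len(rec.shared_log), "callbacks fired in total")
for event in best.target:
    print(event)
```

In this sketch, 9 callbacks fire in total, but the target of the first complete path holds only the 5 events from its own path. The event ("otherWord", "The") appears in shared_log because "The" also qualifies as an ordinary word, yet it never reaches that target. That is the motivation for keeping state on the assembly rather than on the object that receives the callbacks.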