ParseKit makes unexpected calls to selectors

Posted on 2024-11-13 19:46:53

I have the following very simple (test) grammar file:

@start = expression+;
expression = keyword | otherWord;
otherWord = Word;
keyword = a | the;
a = 'a';
the = 'the';

Then I run the following code:

// Grammar contains the contents of the above grammar file.
PKParser *parser = [[PKParserFactory factory] parserFromGrammar:grammar assembler:self];
NSString *s = @"The parrot";
[parser parse:s];
PKReleaseSubparserTree(parser);

And the following methods:

- (void)didMatchA:(PKAssembly *)a{
    [self log:a type:@"didMatchA          "];
}
- (void)didMatchThe:(PKAssembly *)a{
    [self log:a type:@"didMatchThe        "];
}
- (void)didMatchKeyword:(PKAssembly *)a{
    [self log:a type:@"didMatchKeyword    "];
}
- (void)didMatchExpression:(PKAssembly *)a{
    [self log:a type:@"didMatchExpression "];
}
- (void)didMatchOtherWord:(PKAssembly *)a{
    [self log:a type:@"didMatchOtherWord  "];
}

// Logs the callback name, the token on top of the assembly's stack, and the assembly itself.
- (void)log:(PKAssembly *)assembly type:(NSString *)type {
    PKToken *token = [assembly top];
    NSLog(@"Method: [%@], token: %@, assembly: %@", type, token, assembly);
}

And finally I get these messages in the log:

[1] Method: [didMatchThe        ], token: The, assembly: [The]The^parrot
[2] Method: [didMatchKeyword    ], token: The, assembly: [The]The^parrot
[3] Method: [didMatchOtherWord  ], token: The, assembly: [The]The^parrot
[4] Method: [didMatchExpression ], token: The, assembly: [The]The^parrot
[5] Method: [didMatchExpression ], token: The, assembly: [The]The^parrot
[6] Method: [didMatchOtherWord  ], token: parrot, assembly: [The, parrot]The/parrot^
[7] Method: [didMatchExpression ], token: parrot, assembly: [The, parrot]The/parrot^

This mostly makes sense, but I cannot see why [5] occurs. I'd really like to remove the double matching, so that keywords such as "The" trigger only didMatchThe and not didMatchKeyword.

Unfortunately, ParseKit seems to have no documentation on its grammar syntax or on how it decides which methods to trigger. Yes, I've trawled the source code too :-)

Does anyone have experience with ParseKit who can shed some light on this?


Comments (1)

温柔戏命师 2024-11-20 19:46:53

I'm the developer of ParseKit, and this is actually correct behavior. Here are a few points that should help clear this up:

  1. The best way to learn about how ParseKit works is to buy "Building Parsers with Java" by Steven John Metsker. ParseKit is based almost entirely on the designs laid out there.

  2. ParseKit's parser component is extremely dynamic and features infinite look-ahead. This makes it ideal for quick development or for easily parsing small inputs, but it also means ParseKit performs extremely poorly when parsing large documents.

  3. Due to ParseKit's infinite look-ahead, the assembler methods you implement will be called many times. In fact, they will appear to be called too many times, as you've described above. This is normal. ParseKit explores every parse path available to it at any given time, so you get "too many" callbacks.

  4. The answer is to never work on ivars in your assembler callback methods. Instead, always keep the state of what you are working on in the current PKAssembly's target ivar:

    a.target

The current PKAssembly is the one passed into your callback method.
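
To make points 3 and 4 concrete, here is a minimal sketch in Python (not Objective-C, and not ParseKit's actual code) of a set-based matcher in the style of the design the answer attributes to Metsker's book. Every class and method name in it (Assembly, Terminal, Alternation, did_match and so on) is hypothetical, and the case-insensitive literal comparison is an assumption inferred from the log, where "The" triggers didMatchThe. It models only the rule expression = keyword | otherWord applied to the single token "The".

```python
import copy


class Assembly:
    """One candidate parse: token position, a stack, and per-path state."""

    def __init__(self, tokens):
        self.tokens = tokens  # shared token list, never mutated
        self.pos = 0          # index of the next unconsumed token
        self.stack = []       # tokens matched so far on this path
        self.target = []      # per-path state, copied along with the assembly

    def clone(self):
        c = Assembly(self.tokens)
        c.pos = self.pos
        c.stack = list(self.stack)
        c.target = copy.deepcopy(self.target)
        return c


class Parser:
    def __init__(self, name, assembler):
        self.name = name
        self.assembler = assembler

    def match_and_assemble(self, assemblies):
        out = self.match(assemblies)
        for a in out:  # one callback per surviving candidate
            self.assembler.did_match(self.name, a)
        return out


class Terminal(Parser):
    def match(self, assemblies):
        out = []
        for a in assemblies:
            if a.pos < len(a.tokens) and self.qualifies(a.tokens[a.pos]):
                b = a.clone()  # the input candidate is left untouched
                b.stack.append(b.tokens[b.pos])
                b.pos += 1
                out.append(b)
        return out


class Literal(Terminal):
    def __init__(self, name, assembler, text):
        super().__init__(name, assembler)
        self.text = text

    def qualifies(self, token):
        return token.lower() == self.text  # assumption: case-insensitive


class Word(Terminal):
    def qualifies(self, token):
        return token.isalpha()


class Alternation(Parser):
    def __init__(self, name, assembler, *subparsers):
        super().__init__(name, assembler)
        self.subparsers = subparsers

    def match(self, assemblies):
        out = []
        for p in self.subparsers:  # every alternative is explored
            out.extend(p.match_and_assemble(assemblies))
        return out


class Assembler:
    def __init__(self):
        self.calls = []      # order in which callbacks fired
        self.ivar_count = 0  # state shared by all paths (the anti-pattern)

    def did_match(self, name, assembly):
        self.calls.append(name)
        if name == "expression":
            self.ivar_count += 1                         # counts every path
            assembly.target.append(assembly.stack[-1])   # stays on this path


asm = Assembler()
keyword = Alternation("keyword", asm,
                      Literal("a", asm, "a"), Literal("the", asm, "the"))
other_word = Word("otherWord", asm)
expression = Alternation("expression", asm, keyword, other_word)

results = expression.match_and_assemble([Assembly(["The"])])
print(asm.calls)       # ['the', 'keyword', 'otherWord', 'expression', 'expression']
print(asm.ivar_count)  # 2
print([r.target for r in results])  # [['The'], ['The']]
```

In this model the callback order mirrors log lines [1] to [5]: a Word terminal also accepts "The", so the alternation keeps two candidates and the expression callback runs once for each. The counter stored on the assembler sees both paths and reports 2, while each assembly's own target records exactly one expression, which is the reason for keeping state on the assembly rather than in ivars.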

Hope that helps.
