How to tokenize indentation in Coco/R (Python/Boo-style indentation)

Posted on 2024-10-05 01:29:05

Is there a well-known way to tokenize indentation in Coco/R, as in Python or Boo?

Coco/R ignores whitespace, but I need some way to generate beginBlock/endBlock tokens based on the indentation of the next line.

Right now I use a preprocessor that inserts '{', '}', and ';' into the input stream by comparing the indentation of consecutive lines. In the Coco/R grammar I use curly braces as the beginBlock/endBlock tokens. This works well as long as the input stream has no comments (which can also be nested). As soon as comments appear in arbitrary places, the indentation comparison logic fails.

Implementing a preprocessor that also tracks comments looks like over-engineering to me.

So the question is: is it generally possible to parse an indentation-based grammar with Coco/R?
Or should I try something else?

Comments (2)

往昔成烟 2024-10-12 01:29:05

Found an ideal way to do this.

  • Wrap GetNextToken with a method that compares the stream position of the next token with that of the last one.

  • If position.Y has changed and position.X has increased by N tabs, inject N virtual INDENT tokens.

  • If position.Y has changed and position.X has decreased by N tabs, inject N virtual DEDENT tokens.

  • If position.Y has changed but position.X has not, inject a virtual SEPARATOR token.

  • If position.Y has not changed, return the original next token.

  • If the previous token was a soft break (in Python, \), skip the logic above.
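
The steps above could be sketched roughly like this in Java. This is not Coco/R's API: `Tok`, `IndentLayout`, `layout`, and the "softbreak" kind are hypothetical names made up for the illustration, and the sketch processes a whole token list at once instead of wrapping the scanner call. It also assumes that `col` is the indentation measured in tabs, that the comparison is made against the indentation of the previous logical line (not the column of the last token), and that soft-break tokens are swallowed. The DEDENT tokens emitted at end of input are an addition that the steps above do not mention.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical token type; a real scanner's token class will differ. */
final class Tok {
    final String kind; // e.g. "ident", or virtual "INDENT" / "DEDENT" / "SEPARATOR"
    final String text;
    final int line;    // position.Y in the steps above
    final int col;     // position.X, assumed here to be the indent in tabs

    Tok(String kind, String text, int line, int col) {
        this.kind = kind;
        this.text = text;
        this.line = line;
        this.col = col;
    }
}

final class IndentLayout {
    /** Creates a virtual token positioned at the token that triggered it. */
    private static Tok virtual(String kind, Tok at) {
        return new Tok(kind, "", at.line, at.col);
    }

    /** Returns the raw tokens with virtual layout tokens injected. */
    static List<Tok> layout(List<Tok> raw) {
        List<Tok> out = new ArrayList<>();
        int lastLine = -1;         // line of the previous raw token, -1 = none yet
        int indent = 0;            // indent (in tabs) of the current logical line
        boolean softBreak = false; // previous raw token was a line continuation

        for (Tok t : raw) {
            // Layout logic runs only for the first token of a new line,
            // and is skipped right after a soft break.
            if (lastLine != -1 && t.line != lastLine && !softBreak) {
                int delta = t.col - indent;
                if (delta == 0) out.add(virtual("SEPARATOR", t));
                for (int i = 0; i < delta; i++) out.add(virtual("INDENT", t));
                for (int i = 0; i > delta; i--) out.add(virtual("DEDENT", t));
                indent = t.col;
            }
            lastLine = t.line;
            softBreak = t.kind.equals("softbreak");
            if (!softBreak) out.add(t); // assumption: soft breaks are dropped
        }

        // Addition not in the steps above: close blocks still open at end of input.
        for (int i = 0; i < indent; i++) {
            out.add(new Tok("DEDENT", "", lastLine, 0));
        }
        return out;
    }
}
```

Because the comparison happens on tokens rather than on raw lines, comments that the scanner has already skipped should not disturb it, which is the case the preprocessor approach struggled with. In a real scanner the same state (`lastLine`, `indent`, `softBreak`) would presumably live in the wrapper around the token-fetching method, with a small queue for the injected tokens.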

三生路 2024-10-12 01:29:05

First of all, Coco/R only ignores blanks (spaces) by default. Tabs are not ignored:

2.3.5 White space
Characters such as blanks, tabulators or end-of-line symbols are usually considered as white space that should be ignored by the scanner. Blanks are ignored by default. If other characters should be ignored as well the user has to specify them in the following way:

WhiteSpaceDecl = "IGNORE" Set.

Example IGNORE '\t' + '\r' + '\n'

I haven't tested this, but my guess is that you should override the default behavior of the Scanner:

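Token NextToken() {
    // Whitespace-skipping loop at the top of the generated method
    while (ch == ' ' ||
        false
    ) NextCh();
    // ... rest of the generated method
}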

The easiest way to do this is to first modify the generated code until it works, and then make the same changes in the frame files (Scanner.frame and Parser.frame) so that you won't lose them when you regenerate the code.
