Column-aware parsing in Boost::Spirit

Posted 2024-08-05 01:31:27

I'm working on a Boost Spirit 2.0 based parser for a small subset of Fortran 77. The issue I'm having is that Fortran 77 is column oriented, and I have been unable to find anything in Spirit that can allow its parsers to be column-aware. Is there any way to do this?

I don't really have to support the full arcane Fortran syntax, but it does need to be able to ignore lines that have a character in the first column (Fortran comments), and recognize lines with a character in the sixth column as continuation lines.

It seems like folks dealing with batch files would at least have the same first-column problem as me. Spirit appears to have an end-of-line parser, but not a start-of-line parser (and certainly not a column(x) parser).

浪漫之都 2024-08-12 01:31:27

Well, since I now have an answer to this, I guess I should share it.

Fortran 77, like probably all other languages that care about columns, is a line-oriented language. That means your parser has to keep track of the EOL and actually use it in its parsing.

Another important fact is that in my case, I didn't care about parsing the line numbers that Fortran can put in those early control columns. All I need is to know when it is telling me to scan the rest of the line differently.

Given those two things, I could entirely handle this issue with a Spirit skip parser. I wrote mine to

  • skip the entire line if the first (comment) column contains an alphabetic character.
  • skip the entire line if there is nothing on it.
  • ignore the preceding EOL and everything through the sixth column if the sixth column contains a '.' (continuation line). This tacks it onto the preceding line.
  • skip all non-eol whitespace (even spaces don't matter in Fortran. Yes, it's a weird language.)
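
Independently of Spirit, the four rules above can be sketched as a plain line preprocessor. This is only an illustration and not the approach the answer takes: `join_logical_lines` is a hypothetical helper, it uses the standard library only, and it simplifies the empty-line rule to "nothing but blanks".

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.Spirit or of the original answer).
// Collapses fixed-form source text into logical lines using the rules above.
std::vector<std::string> join_logical_lines(const std::string& source)
{
    std::vector<std::string> logical;
    std::istringstream in(source);
    std::string line;

    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                       // tolerate CRLF input

        // Rule 1: an alphabetic character in column 1 marks a comment line.
        if (!line.empty() && std::isalpha(static_cast<unsigned char>(line[0])))
            continue;

        // Rule 2 (simplified): a line holding only blanks is skipped.
        if (line.find_first_not_of(" \t") == std::string::npos)
            continue;

        // Rule 3: a '.' in column 6 (index 5), with a blank in column 1,
        // marks a continuation of the previous logical line.
        const bool continuation = line.size() > 5 &&
                                  (line[0] == ' ' || line[0] == '\t') &&
                                  line[5] == '.';
        std::string body = continuation ? line.substr(6) : line;

        // Rule 4: blanks carry no meaning, so drop them all.
        body.erase(std::remove_if(body.begin(), body.end(),
                                  [](unsigned char c) { return c == ' ' || c == '\t'; }),
                   body.end());

        if (continuation && !logical.empty())
            logical.back() += body;                // tack onto the previous line
        else
            logical.push_back(body);
    }
    return logical;
}
```

Feeding it a comment line, a statement split over a continuation line, and a blank line leaves just the joined statements. Doing the same work inside a skip parser, as the answer does, avoids the extra pass over the input.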

Here's the skip parser:

        skip = 
            // Full line comment
            (spirit::eol >> spirit::ascii::alpha >> *(spirit::ascii::char_  - spirit::eol))
            [boost::bind (&fortran::parse_info::skipping_line, &pi)]
        |  
            // remaining line comment
            (spirit::ascii::char_ ('!') >> *(spirit::ascii::char_ - spirit::eol))
            [boost::bind (&fortran::parse_info::skipping_line_comment, &pi)]
        |
            // Continuation
            (spirit::eol >> spirit::ascii::blank >> 
             spirit::qi::repeat(4)[spirit::ascii::char_ - spirit::eol] >> ".")
            [boost::bind (&fortran::parse_info::skipping_continue, &pi)]

        |   
            // empty line 
            (spirit::eol >> 
             -(spirit::ascii::blank >> spirit::qi::repeat(0, 4)[spirit::ascii::char_ - spirit::eol] >> 
               *(spirit::ascii::blank) ) >> 
             &(spirit::eol | spirit::eoi))
            [boost::bind (&fortran::parse_info::skipping_empty, &pi)]
        |   
            // whitespace (this needs to be the last alternative).
            (spirit::ascii::space - spirit::eol)
            [boost::bind (&fortran::parse_info::skipping_space, &pi)]
        ;
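
Each alternative in the skipper carries a semantic action that calls a member function of `fortran::parse_info` on the object `pi`. The answer does not show that class, so the following is only a guess at its shape: the counters are hypothetical, and `std::bind` stands in for `boost::bind` to keep the sketch free of Boost headers.

```cpp
#include <functional>

namespace fortran {

// Hypothetical stand-in for the answer's parse_info; the real class is not shown.
// Here each callback just counts how often the skipper took that alternative.
struct parse_info {
    int comment_lines = 0;
    int trailing_comments = 0;
    int continuations = 0;
    int empty_lines = 0;
    int blanks = 0;

    void skipping_line()         { ++comment_lines; }
    void skipping_line_comment() { ++trailing_comments; }
    void skipping_continue()     { ++continuations; }
    void skipping_empty()        { ++empty_lines; }
    void skipping_space()        { ++blanks; }
};

} // namespace fortran

// Binding a member function to an object pointer yields a callable that needs
// no arguments, which is the shape the semantic actions in the skipper rely on.
std::function<void()> make_comment_callback(fortran::parse_info& pi)
{
    return std::bind(&fortran::parse_info::skipping_line, &pi);
}
```

Calling the returned object twice increments `pi.comment_lines` twice. Because the callback holds a pointer to `pi`, the `parse_info` object has to outlive the grammar that uses it.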

I would advise against blindly using this yourself for line-oriented Fortran, as I ignore line numbers, and different compilers have different rules for valid comment and continuation characters.
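
To make that caveat concrete, here is a minimal sketch that keeps the column rules in a small configuration struct, so they can be changed per dialect. The type and function names are hypothetical. The defaults follow standard Fortran 77 as I understand it: `C` or `*` in column 1 marks a comment, and any character other than a blank or `0` in column 6 marks a continuation. Tabs are not handled.

```cpp
#include <string>

// Hypothetical configuration type; adjust the sets for the dialect at hand.
struct column_rules {
    std::string comment_markers  = "C*";  // column-1 characters that mark a comment
    std::string non_continuation = " 0";  // column-6 characters that do NOT mark a continuation
};

// True when column 1 holds one of the configured comment markers.
bool is_comment_line(const std::string& line, const column_rules& rules)
{
    return !line.empty() &&
           rules.comment_markers.find(line[0]) != std::string::npos;
}

// True when column 6 (index 5) holds a continuation character.
bool is_continuation_line(const std::string& line, const column_rules& rules)
{
    if (line.size() < 6 || is_comment_line(line, rules))
        return false;
    return rules.non_continuation.find(line[5]) == std::string::npos;
}
```

A more permissive dialect could be described by setting `comment_markers` to something like `"Cc*!"` without touching the predicates.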
