Is it possible to create custom parsers in Boost.Spirit?

Posted on 2024-09-14 12:49:59

I was trying to create a custom Parser class in Boost.Spirit (2.3), but it didn't work out. The code is:

template <class Iter>
class crule : public boost::spirit::qi::parser<crule<Iter> >
{
  rule<Iter> r_;
public:
  crule(const rule<Iter>& r) : r_(r) {}
  template <class T>
  crule(const T& t) : r_(t) {}
  template<class Ctx, class Skip>
  bool parse(Iter& f, const Iter& l, Ctx& context, Skip& skip, typename rule<Iter>::template attribute<Ctx, Iter>::type& attr) const {
    return r_.parse(f, l, context, skip, attr);
  }
  template <class Ctx>
  boost::spirit::info what(Ctx& context) const {
    return r_.what(context);
  }
  template <class Context, class It>
  struct attribute {
    typedef typename rule<Iter>::template attribute<Context, It>::type type;
  };
};

and although I have (at least I think I have) fulfilled all the requirements, I get errors when I try to use this class in a parsing expression:

shell_grammar.h:134: error: no match for 'operator!' in '!shell_grammar<Iter>::token(boost::spirit::qi::rule<Iter, boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>) [with Iter = __gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >](boost::spirit::qi::rule<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>(((const boost::spirit::qi::rule<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>&)((const boost::spirit::qi::rule<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>*)(&((shell_grammar<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)this)->shell_grammar<__gnu_cxx::__normal_iterator<const char*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::reserved_words)))))'

shell_grammar.h:134: note: candidates are: operator!(bool) <built-in>

I tried to look at the implementation of other parsers (e.g. not_predicate), but I can't figure out what the difference is that makes them work.

Motivation

The reason I am doing this is related to this question. I want to parse the POSIX shell language, which has peculiar lexical rules. In particular, a "skipper parser" has to be applied even inside lexemes, but it has to be different from the "phrase level" skipper parser. The lexeme directive can't do that, and skip doesn't pre-skip (AFAIK), which I need as well. So I want to create a function

something token(std::string);

that would return a rule matching the token. One way is to create my own rule wrapper that would serve as a terminal (since a rule alone cannot be used here because of its reference semantics); another is to create a new parser (which would be a nonterminal in proto) and implement the shell's token parsing in it.
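To make the intended behaviour concrete, here is a minimal sketch in plain C++ (deliberately not using Spirit) of what such a token function could do: pre-skip with a phrase-level skipper, then apply a different skipper between the characters of the token. The names Parser, skip_blanks and skip_continuations are hypothetical, and the choice of skippers (blanks between tokens, backslash-newline inside a token) is an assumption made for illustration; the question does not say which skippers are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical parser type: advances `pos` on success, leaves it untouched on failure.
using Parser = std::function<bool(const std::string& input, std::size_t& pos)>;

// Assumed phrase-level skipper: spaces and tabs between tokens.
inline void skip_blanks(const std::string& in, std::size_t& pos) {
    while (pos < in.size() && (in[pos] == ' ' || in[pos] == '\t')) ++pos;
}

// Assumed in-token skipper: backslash-newline pairs (line continuations).
inline void skip_continuations(const std::string& in, std::size_t& pos) {
    while (pos + 1 < in.size() && in[pos] == '\\' && in[pos + 1] == '\n') pos += 2;
}

// Returns a parser that matches `word` as one token.
// The word is captured by value, so the returned parser owns its data.
// Note: this sketch does not check that the token ends at a delimiter.
inline Parser token(std::string word) {
    return [word](const std::string& in, std::size_t& pos) {
        std::size_t p = pos;
        skip_blanks(in, p);                // pre-skip with the phrase-level skipper
        for (char c : word) {
            skip_continuations(in, p);     // different skipper inside the token
            if (p >= in.size() || in[p] != c) return false;
            ++p;
        }
        pos = p;                           // commit the new position only on success
        return true;
    };
}
```

Calling token("while") on an input that starts with two spaces followed by "wh", a backslash, a newline and "ile" would succeed and leave the position just after the final "e". Because the word is copied into the returned object, the result can be passed around by value, which is the property the rule wrapper above is trying to obtain.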

Comments (3)

余生再见 2024-09-21 12:49:59

It is quite possible, but I have found it to be at least as much work as (and harder to debug than) just writing my own lexers and recursive descent parsers by hand. Even fairly small Spirit grammars can take me weeks of wrestling with the compiler.

The error message you got shows the kind of problems you run into. Any time you get an error, it's an error from some template instantiation deep in the bowels of Spirit, with many further layers of template instantiations added in to confuse matters. To have any hope of deciphering the error messages, you pretty much have to understand the code for the entire facility.

I hate to be critical, because Spirit is a worthy effort. I did my Master's thesis on implementing an object-oriented compiler-generator, so I'm a fan of the concept. I really wanted to like it, but Spirit is just too hard for anyone but serious C++ experts to use.

To compare with what can be done, take a look at the Ada OpenToken project. Spirit is probably more flexible, but compile errors are much more sensible in OpenToken, and a glance through the version history on that page shows a very large percentage of their effort has been put into helping users debug errors.

鹿港小镇 2024-09-21 12:49:59

The code you provided looks OK (at least as far as the interface of the actual parser is concerned). But in order to integrate a custom parser with Spirit you need to do more work. Spirit's website has an example of a custom parser component explaining all the required steps here.

It looks to me as if you were unnecessarily trying to do things the hard way. But I don't fully understand what you're trying to achieve, so I might be wrong. If you explained your use case I'm sure we could come up with a simpler solution.

寒江雪… 2024-09-21 12:49:59

BTW, this is what I came up with:

You need to register the class as a literal in boost::proto like this:

template <class T>
struct crulexx : public boost::proto::literal<rule<T> >
{
  template <class U>
  crulexx(const U& u) : boost::proto::literal<rule<T> >(rule<T>(u)) {}
};

It works for me in this test. However, I got segfaults in another piece of code that uses it, which I will have to debug.
