Problem trying to compile a Spirit.Qi parser

Posted on 2024-10-17 08:54:30

Below is a fully self-contained example. The problem appears to be in lines 84-89 (the innermost optional group, marked with a comment in the grammar below): if those lines are commented out, the example compiles. I'm trying to parse a file line by line, where each line has five colon-delimited items and the last three are optional. The single function takes a boost::filesystem::path, maps the file into memory using Boost.Interprocess, and parses it.

Examples of what I want this to parse:

a:1
a:2:c
a:3::d
a:4:::e
a:4:c:d:e

The results should be stored in a vector<file_line>, where file_line is a struct with five members, the last three being optional. Here is the code, and the errors:

Code

#if defined(_MSC_VER) && (_MSC_VER >= 1020)
# pragma warning(disable : 4512) // assignment operator could not be generated
# pragma warning(disable : 4127) // conditional expression is constant
# pragma warning(disable : 4244) // 'initializing' : conversion from 'int' to 'char', possible loss of data
#endif

#include <boost/fusion/adapted/struct/adapt_struct.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/qi.hpp>
#include <boost/spirit/home/qi/string.hpp>
#include <boost/spirit/home/karma.hpp>
#include <boost/spirit/home/karma/binary.hpp>
#include <boost/spirit/home/phoenix.hpp>
#include <boost/spirit/home/phoenix/bind.hpp>
#include <boost/spirit/home/phoenix/core.hpp>
#include <boost/spirit/home/phoenix/operator.hpp>
#include <boost/spirit/home/phoenix/statement/sequence.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/filesystem/operations.hpp>

#include <string>

// This struct and fusion adapter is for parsing file servers in colon-newline format. 
struct file_line
{
  std::string a;
  unsigned short b;
  boost::optional<std::string> c;
  boost::optional<std::string> d;
  boost::optional<std::string> e;
};
BOOST_FUSION_ADAPT_STRUCT(
  file_line,
  (std::string, a)
  (unsigned short, b)
  (boost::optional<std::string>, c)
  (boost::optional<std::string>, d)
  (boost::optional<std::string>, e)
)

void
import_proxies_colon_newline(const boost::filesystem::path& file)
{
  using namespace boost::spirit;
  using qi::parse;
  using qi::char_;
  using qi::eol;
  using qi::eoi;
  using qi::lit;
  using qi::ushort_;

  // <word>:<ushort>:[word]:[word]:[word]
  if(boost::filesystem::exists(file) && 0 != boost::filesystem::file_size(file))
  {
    // Use Boost.Interprocess for fast sucking in of the file. It works great, and provides the bidirectional
    // iterators that we need for spirit.
    boost::interprocess::file_mapping mapping(file.file_string().c_str(), boost::interprocess::read_only);
    boost::interprocess::mapped_region mapped_rgn(mapping, boost::interprocess::read_only);

    const char*       beg = reinterpret_cast<char*>(mapped_rgn.get_address());
    char const* const end = beg + mapped_rgn.get_size();

    // And parse the data, putting the results into a vector of pairs of strings.
    std::vector<file_line> output;

    parse(beg, end,

          // Begin grammar
          (
            *(
                *eol
              >> +(char_ - (':' | eol) 
              >> ':' >> ushort_         
              >> -(':'
                    >> *(char_ - (':' | eol)) 
                    >> (eol | 
                          -(':'
                              >> *(char_ - (':' | eol)) 

                              // This doesn't work. Uncomment it, won't compile. No idea why. It's the same
                              // as above.
                              >> (eol |
                                    -(':'
                                        >>
                                        +(char_ - eol) 
                                      )
                                )
                          )
                        )
                  )
              >> *eol
            )
          )
          // End grammar, begin output data

          ,output
          );
  }
}

Error Messages from MSVC 10

Since questions are limited to 30,000 characters, I'll only show the first few errors here. Compiling the example on your machine should produce the same output.

1>C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/support/container.hpp(101): error C2955: 'boost::Container' : use of class template requires template argument list
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/concept_check.hpp(602) : see declaration of 'boost::Container'
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/qi/operator/kleene.hpp(65) : see reference to class template instantiation 'boost::spirit::traits::container_value<Container>' being compiled
1>          with
1>          [
1>              Container=char
1>          ]
1>          C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/qi/detail/fail_function.hpp(38) : see reference to function template instantiation 'bool boost::spirit::qi::kleene<Subject>::parse<Iterator,Context,Skipper,Attribute>(Iterator &,const Iterator &,Context &,const Skipper &,Attribute &) const' being compiled
1>          with
1>          [
1>              Subject=boost::spirit::qi::difference<boost::spirit::qi::char_class<boost::spirit::tag::char_code<boost::spirit::tag::char_,boost::spirit::char_encoding::standard>>,boost::spirit::qi::alternative<boost::fusion::cons<boost::spirit::qi::literal_char<boost::spirit::char_encoding::standard,true,false>,boost::fusion::cons<boost::spirit::qi::eol_parser,boost::fusion::nil>>>>,
1>              Iterator=const char *,
1>              Context=const boost::fusion::unused_type,
1>              Skipper=boost::fusion::unused_type,
1>              Attribute=char
1>          ]

...snip...

1>C:\devel\dependencies\boost\boost-1_44\include\boost/spirit/home/support/container.hpp(102): fatal error C1903: unable to recover from previous error(s); stopping compilation


Comments (1)

小兔几 2024-10-24 08:54:30

I already answered on the Spirit mailing list, but let me post it here for the sake of completeness as well.


Your example is far from minimal. I see no reason why you left the Interprocess, Filesystem or Karma references in the code. This just makes diagnosing things so much more difficult for everybody willing to help. Moreover, you have a mismatched parenthesis in there somewhere. I assume you forgot to close the +(char_ - (':' | eol).

Ok, let's look closer. This is your (simplified) grammar. It does not do anything useful anymore, but attribute-wise it should behave the same as the original one:

*(+char_ >> -(*char_ >> (eol | -(*char_ >> (eol | -(':' >> +char_))))))

The exposed (propagated) attribute of this grammar is:

vector<
  tuple<
    std::vector<char>,
    optional<
      tuple<
        std::vector<char>,
        variant<
          char,
          optional<
            tuple<
              std::vector<char>,
              variant<
                char,
                optional<
                  std::vector<char>
                >
              >
            >
          >
        >
      >
    >
  >
>

Attribute compatibility rules can do quite a bit, but they certainly can't map a std::string onto a variant<char, vector<char> >. Moreover, I believe you no longer understand your grammar yourself, so why would you expect Spirit to get it right in this case?

What I'd suggest is that you start by simplifying your grammar, factoring things out into rules. That not only makes it easier to understand, but also lets you tell Spirit which attribute you expect to get back from which subpart of your grammar. For instance:

rule<char const*, std::string()> e1 = +~char_(":\r\n");
rule<char const*, std::string()> e2 = *~char_(":\r\n");
rule<char const*, std::string()> e3 = +~char_("\r\n");
rule<char const*, unsigned short()> u = ':' >> ushort_;
rule<char const*, file_line()> fline = 
    *eol >> e1 >> u
         >> -(':' >> e2 >> (eol | -(':' >> e2 >> (eol | -(':' >> e3))))) >> *eol;

which makes the overall grammar more readable already:

*fline

pretty, huh?

If you think about it further, you will realize that writing

foo >> (eol | -bar) >> *eol

is equivalent to:

foo >> -bar >> *eol

which simplifies it even more:

rule<char const*, file_line()> f = 
    *eol >> e1 >> u >> -(':' >> e2 >> -(':' >> e2 >> -(':' >> e3) ) ) >> *eol;

What you can see now is that your grammar produces at least 5 sub-attributes, while your file_line has only four members. You need to adjust your file_line structure accordingly.

The above does compile now (Boost SVN trunk), but it fails to produce the correct results. If I feed it "a:4:c:d:e", I get: output[0].a == "a", output[0].b == 4, and output[0].c == "cde". Let's analyze why that happens.

Again, attribute compatibility rules can do only part of the work. In this case file_line::a gets mapped onto e1 and file_line::b onto u, while file_line::c gets mapped onto the whole rest of the expression. That's what you would expect, actually, as the optional breaks the sequence into 3 elements. Your attribute is 'flattened', while the grammar is not.
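To pin down what the 'flattened' target looks like for the sample lines, independent of Spirit, here is a minimal standard-library sketch (C++17). It is not the grammar from this thread, just a hand-written reference to compare a grammar's output against. The names flat_line and split_line are made up for illustration, std::optional stands in for boost::optional, and treating an empty field (as in "a:3::d") as absent is an assumption about the intended result.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical flat record mirroring the question's file_line,
// with std::optional standing in for boost::optional.
struct flat_line {
    std::string a;
    unsigned short b = 0;
    std::optional<std::string> c, d, e;
};

// Splits one line (without its line ending) into the five fields.
// Returns std::nullopt if either mandatory field is missing or malformed.
std::optional<flat_line> split_line(const std::string& line) {
    std::vector<std::string> parts;
    std::size_t pos = 0;
    // Cut at the first four colons only; the fifth field keeps the rest
    // of the line, colons included (like e3 = +~char_("\r\n")).
    while (parts.size() < 4) {
        const std::size_t colon = line.find(':', pos);
        if (colon == std::string::npos) break;
        parts.push_back(line.substr(pos, colon - pos));
        pos = colon + 1;
    }
    parts.push_back(line.substr(pos));

    // The first two fields are mandatory and must be non-empty.
    if (parts.size() < 2 || parts[0].empty() || parts[1].empty())
        return std::nullopt;

    // The second field must be a decimal number that fits in 16 bits.
    unsigned long value = 0;
    for (char ch : parts[1]) {
        if (ch < '0' || ch > '9') return std::nullopt;
        value = value * 10 + static_cast<unsigned long>(ch - '0');
        if (value > 65535UL) return std::nullopt;
    }

    // Assumption: an empty optional field counts as "not present".
    auto field = [&parts](std::size_t i) -> std::optional<std::string> {
        if (i < parts.size() && !parts[i].empty()) return parts[i];
        return std::nullopt;
    };

    flat_line out;
    out.a = parts[0];
    out.b = static_cast<unsigned short>(value);
    out.c = field(2);
    out.d = field(3);
    out.e = field(4);
    return out;
}
```

With this sketch, split_line("a:4:c:d:e") fills c, d and e separately, and split_line("a:3::d") leaves c empty while setting d. That is the shape the Spirit grammar is expected to produce, instead of everything after the number ending up in c.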

There are two solutions: a) change your attribute to match the structure of the grammar:

struct file_line
{
  std::string a;
  unsigned short b;
  boost::optional<
    fusion::vector<
      std::string, 
      boost::optional<
        fusion::vector<std::string, boost::optional<std::string> >
      >
    >
  > c;
};

or b) use semantic actions to set the elements of your attribute (which is what I would do).
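If you go with option a), you will probably still want a flat view of the result afterwards. The following is a rough standard-library sketch (C++17) of that unpacking step, not Spirit code: std::optional and std::tuple stand in for boost::optional and fusion::vector, and the names nested_line, flat_view and flatten are invented for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <tuple>

// Stand-ins for the nested attribute of option a):
// optional<(string, optional<(string, optional<string>)>)>
using tail_e = std::optional<std::string>;
using tail_d = std::optional<std::tuple<std::string, tail_e>>;
using tail_c = std::optional<std::tuple<std::string, tail_d>>;

// Hypothetical mirror of the nested file_line from option a).
struct nested_line {
    std::string a;
    unsigned short b = 0;
    tail_c c;
};

// Hypothetical flat record with one optional per trailing field.
struct flat_view {
    std::string a;
    unsigned short b = 0;
    std::optional<std::string> c, d, e;
};

// Walks the nesting level by level and copies each field out.
flat_view flatten(const nested_line& in) {
    flat_view out;
    out.a = in.a;
    out.b = in.b;
    if (in.c) {
        out.c = std::get<0>(*in.c);
        const tail_d& d = std::get<1>(*in.c);
        if (d) {
            out.d = std::get<0>(*d);
            out.e = std::get<1>(*d);  // already an optional<string>
        }
    }
    return out;
}
```

Note that in the nested form an empty field such as the third item of "a:3::d" arrives as an engaged optional holding an empty string, so flatten passes it through as "" rather than as "absent". Whether to map empty strings to an empty optional is a separate decision.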
